Using Linq with Microsoft Word and Excel

Posted Fri, Aug 14 2009 13:14 by Deborah Kurata

Some of the collections in the Microsoft Office object models implement IEnumerable. The IEnumerable interface provides the ability to perform a for/each against the collection. With .NET 3.5, a Cast extension method of IEnumerable allows you to work with these collections using Linq.

Microsoft Word

For example, say you want to bind the set of open Word document names in a ComboBox.

First, set a reference to the desired version of the Microsoft Word Object Library from the COM tab of the Add Reference dialog. The resulting reference appears as Microsoft.Office.Interop.Word.

In C#:

// Add to the top of the code file
using Word = Microsoft.Office.Interop.Word;

// Add to a subroutine
Word.Application Wd;
Word.Document doc;
Word.Document doc2;
object missingValue = Missing.Value;

// Start Word and get Application object
Wd = new Word.Application();

// Define documents
doc = Wd.Documents.Add(ref missingValue,ref missingValue, 
                       ref missingValue,ref missingValue );
doc2 = Wd.Documents.Add(ref missingValue, ref missingValue, 
                        ref missingValue, ref missingValue);

// Use Linq to access the document names.
var query = from d in Wd.Documents.Cast<Word.Document>()
            select d.Name;
comboBox1.DataSource = query.ToList();

// Or use Lambda expressions
var query2 = Wd.Documents.Cast<Word.Document>().Select(d=> d.Name);
comboBox1.DataSource = query2.ToList();

// Close
doc.Close(ref missingValue, ref missingValue, ref missingValue);
doc = null;
doc2.Close(ref missingValue, ref missingValue, ref missingValue);
doc2 = null;
Wd.Quit(ref missingValue, ref missingValue, ref missingValue);

// Clean up
// NOTE: When in release mode, this does the trick
GC.WaitForPendingFinalizers();
GC.Collect();
GC.WaitForPendingFinalizers();
GC.Collect() ;

In VB:

' Add to the top of the code file
Imports Word = Microsoft.Office.Interop.Word

' Add to a subroutine
Dim Wd As Word.Application
Dim doc As Word.Document
Dim doc2 As Word.Document

' Start Word and get Application object
Wd = New Word.Application

' Define documents
doc = Wd.Documents.Add
doc2 = Wd.Documents.Add

' Use Linq to access the document names.
Dim query = From d In Wd.Documents.Cast(Of Word.Document)() _
            Select d.Name
ComboBox1.DataSource = query.ToList

' Or use Lambda expressions
Dim query2 = Wd.Documents.Cast(Of Word.Document) _
                    .Select(Function(d) d.Name)
ComboBox1.DataSource = query2.ToList

' Close
doc.Close()
doc = Nothing
doc2.Close()
doc2 = Nothing
Wd.Quit()

' Clean up
' NOTE: When in release mode, this does the trick
GC.WaitForPendingFinalizers()
GC.Collect()
GC.WaitForPendingFinalizers()
GC.Collect()

In both of these examples, the code starts Word, creates two Word documents, uses either Linq or a Lambda expression to define a query and then binds the resulting set of document names to a Combo Box.

Notice the missingValue variable in the C# code that is not in the VB code. VB supports default parameters, but C# does not. So any time a parameter is defined for a Word method, C# must provide it. VB will use the default parameter values.

NOTE: A new feature in C# 4.0 (Visual Studio 2010) allows for default parameters in C# as well, dramatically simplifying the C# code that interacts with Word or Excel.

As another example, the following code retrieves all of the words from the defined Word document.

In C#:

var query = from w in doc.Words.Cast<Word.Range>()
             select w.Text;
comboBox1.DataSource = query.ToList();

In VB:

Dim query = From w In doc.Words.Cast(Of Word.Range)() _
             Select w.Text
ComboBox1.DataSource = query3.ToList

This code retrieves all of the words in the document defined by the doc variable. Instead of selecting the list of words, you could use any Linq feature such as finding only a specific set of words that match a criteria or counting the number of occurrences of a given word.

Microsoft Excel

This technique works with Excel as well. Say you want to bind the list of spreadsheets in an Excel workbook.

First, set a reference to the desired version of the Microsoft Excel Object Library from the COM tab of the Add Reference dialog. The resulting reference appears as Microsoft.Office.Interop.Excel.

In C#:

// Add to the top of the code file
using Excel = Microsoft.Office.Interop.Excel;

// Add to a subroutine
Excel.Application oXL;
Excel.Workbook oWB;
Excel.Worksheet oSheet;

// Start Excel and get Application object.
oXL = new Excel.Application();

// Get a new workbook.
oWB = oXL.Workbooks.Add(Missing.Value);

// Get the active sheet and change its name
oSheet = (Excel.Worksheet)oWB.ActiveSheet ;
oSheet.Name = "Test";

// Use Linq to access the spreadsheet names.
var query = from s in oXL.Worksheets.Cast<Excel.Worksheet>()
            select s.Name;
comboBox1.DataSource = query.ToList();

// Or use Lambda expressions.
var query2 = oXL.Worksheets.Cast<Excel.Worksheet>()
            .select(s => s.Name);
comboBox1.DataSource = query2.ToList();

// Close
oSheet = null;
oWB.Close(Missing.Value, Missing.Value, Missing.Value);
oWB = null;
oXL.Quit();

// Clean up
// NOTE: When in release mode, this does the trick
GC.WaitForPendingFinalizers();
GC.Collect();
GC.WaitForPendingFinalizers();
GC.Collect();

In VB:

' Add to the top of the code file
Imports Excel = Microsoft.Office.Interop.Excel

' Add to a subroutine
Dim oXL As Excel.Application
Dim oWB As Excel.Workbook
Dim oSheet As Excel.Worksheet

' Start Excel and get Application object.
oXL = New Excel.Application

' Get a new workbook.
oWB = oXL.Workbooks.Add

' Get the active sheet and change its name
oSheet = DirectCast(oWB.ActiveSheet, Excel.Worksheet)
oSheet.Name = "Test"

' Use Linq to access the spreadsheet names.
Dim query = From s In oXL.Worksheets.Cast(Of Excel.Worksheet)() _
            Select s.Name
ComboBox1.DataSource = query.ToList

' Or use Lambda expressions
Dim query2 = oXL.Worksheets.Cast(Of Excel.Worksheet) _
                            .Select(Function(s) s.Name)
ComboBox1.DataSource = query2.ToList

' Close
oSheet = Nothing
oWB.Close()
oWB = Nothing
oXL.Quit()

' Clean up
' NOTE: When in release mode, this does the trick
GC.WaitForPendingFinalizers()
GC.Collect()
GC.WaitForPendingFinalizers()
GC.Collect()

In both examples, the code starts Excel, changes the name of the active sheet, uses either Linq or a Lambda expression to define a query and then binds the resulting set of sheet names to a Combo Box.

Enjoy!

EDITED 11/16/09: Added information on setting the appropriate reference.

Filed under: , , , , , , ,

Comments

# re: Using Linq with Microsoft Word and Excel

Saturday, August 15, 2009 10:38 AM by Greg Duncan

Nice! Thanks for posting this.

I think you've just saved me hours of future hair-pulling-out... :)

# re: Using Linq with Microsoft Word and Excel

Monday, August 17, 2009 5:45 AM by Daniel Earwicker

Don't overlook the OfType method as an alternative to Cast. The Cast method will throw if the conversion is not allowed, whereas OfType will simply filter non-matching objects out of the sequence. So the OfType method is a bit like the 'as' keyword.

# re: Using Linq with Microsoft Word and Excel

Monday, August 17, 2009 10:17 AM by Deborah Kurata

Thanks for your suggestion, Daniel.

# Using Linq with Microsoft Word and Excel - Deborah Kurata

Monday, August 17, 2009 2:33 PM by DotNetShoutout

Thank you for submitting this cool story - Trackback from DotNetShoutout

# re: Using Linq with Microsoft Word and Excel

Tuesday, August 18, 2009 4:05 AM by Robert Bravery

Fantastic Post. I'm assuming you've tested with office 2007.

# re: Using Linq with Microsoft Word and Excel

Tuesday, August 18, 2009 10:07 PM by Deborah Kurata

Hi Robert -

Yes, I tested with Office 2007.

Thanks for visiting my blog!

# Links 2009-08-19

Wednesday, August 19, 2009 7:34 AM by Gunnar Peipman's ASP.NET blog

JQuery and web development Limit Number of Characters in a TextArea using jQuery Find out which Key was

# re: Using Linq with Microsoft Word and Excel

Monday, September 14, 2009 5:26 PM by David Sutherland

I just wanted to thank you for this post too.  I was just starting to do Linq to Word for a project and this posting came riding to the rescue.

Cheers;

Dave

# re: Using Linq with Microsoft Word and Excel

Wednesday, June 16, 2010 2:05 PM by Darin

Yep, it's me<g>

Great post. I'd run into this casting problem a couple times and hadn't seen a way around it.

This makes linq a whole lot more helpful with Word and Excel.

I've already found it incredibly handy for OpenXML work.

Good stuff!

# re: Using Linq with Microsoft Word and Excel

Saturday, August 13, 2011 12:36 AM by Jorja

Life is short, and this article saved vaualble time on this Earth.

# re: Using Linq with Microsoft Word and Excel

Sunday, August 14, 2011 9:26 AM by Randi

Super informative wrintig; keep it up.

# Simplifying enumeration of Office object model collections with Linq and Extension methods

Saturday, April 06, 2013 1:02 PM by Saveen Reddy's blog

I found an interesting post on Deborah's Developer MindScape   ( via Greg’s blog )and wanted to

Leave a Comment

(required) 
(required) 
(optional)
(required) 
If you can't read this number refresh your screen
Enter the numbers above: