Reading Comma Delimited Files

Posted Tue, Aug 25 2009 12:14 by Deborah Kurata

There may be times that you need to read comma separated value (CSV) files into your application. For example, you obtain output from a legacy system or other application in a comma delimited text file format, and you need to read and use that data in your application.

NOTE: For more information on delimited files, see this link.

.NET provides several techniques for reading text files. This post focuses on how to read a comma delimited text file into a DataTable.

You may find it very useful to read your text file into a DataTable, whether or not you plan to use a database. Reading a text file into a DataTable not only saves you a significant amount of string manipulation coding, it also makes it easy to access the imported data from within your application.

For example, you can use binding to bind the resulting DataTable to a grid or other controls. You can use Linq to DataTables like in this example to manipulate the resulting data. All of the features of the DataTable are then available to you.

BIG NOTE: Many developers have ignored this technique because one look at the code and the developer assumed it is somehow associated with a database, it is NOT. This is referring to in-memory DataTable objects.

For this example, the text file appears as follows:

CustomerId, LastName, FirstName, LastUpdateDate
1,  Baggins, Bilbo, 20090811 
2,  Baggins, Frodo, 20090801 
3,  Gamgee,  Samwise, 20090820 
4,  Cotton,  Rosie, 20090821

Notice several things about this file:

  1. It is a comma separated value (CSV) file.
  2. The first line provides the column names. This is optional.

You can read this text file into a DataTable using OleDb as follows.

In C#:

string fileName = "testCSV.txt";
string dirName = Path.GetDirectoryName(Application.ExecutablePath);
DataTable dt;

using (OleDbConnection cn =
    new OleDbConnection(@"Provider=Microsoft.Jet.OleDb.4.0;" +
            "Data Source=" + dirName + ";" +
            "Extended Properties=\"Text;HDR=Yes;FMT=Delimited\""))
{
    // Open the connection
    cn.Open();

    // Set up the adapter
    using (OleDbDataAdapter adapter =
        new OleDbDataAdapter("SELECT * FROM " + fileName, cn))
    {
        dt = new DataTable("Customer");
        adapter.Fill(dt);
    }
}

In VB:

Dim fileName As String = "testCSV.txt"
Dim dirName As String = _
            Path.GetDirectoryName(Application.ExecutablePath)
Dim dt As DataTable

Using cn As New OleDbConnection("Provider=Microsoft.Jet.OleDb.4.0;" & _
            "Data Source=" & dirName & ";" & _
            "Extended Properties=""Text;HDR=Yes;FMT=Delimited""")
    ' Open the connection
    cn.Open()

    ' Set up the adapter
    Using adapter As New OleDbDataAdapter( _
            "SELECT * FROM " & fileName, cn)
        dt = New DataTable("Customer")
        adapter.Fill(dt)
    End Using
End Using

This code starts by declaring variables to hold the text file name, directory containing the file and the resulting DataTable.

This technique only works with a standard set of file name extensions (see the NOTE at the end of this post). The file can reside in any directory. In this example, the file resides in the same directory where the application is executed. But this is not a requirement.

The first using statement in the example code sets up the connection to the directory. It sets the Provider property to use the Microsoft.Jet.OleDb provider. The Data Source property defines the directory containing the text file. The Extended Properties define that the file will be Text ("Text"), it has a header (HDR=Yes), and it is in a delimited file format (FRM=Delimited). The Extended Properties must be within quotes, so double-quotes (VB) or slash quote (C#) are used to escape the included quotes.

The code then opens the connection, thereby opening the file. Since this code is in a using statement, the file is automatically closed at the end of the using block.

The second using statement sets of the DataAdapter by defining a Select statement and the open connection. The Select statement selects all of the information from a specific file as defined by the fileName variable.

The code then creates the DataTable, giving the table a name. In this example, the table name is "Customer".

Finally, it uses the Fill method of the TableAdapter to read the data from the text file into the DataTable.

Using the technique detailed here, you can view the resulting DataTable. The column headings were defined by the header in the text file. If you don't have a header, the columns will be giving a default name.

image

You can then access the data in the table as you access any other DataTable. For example:

In C#:

foreach (DataRow dr in dt.Rows)
{
    Debug.Print("{0}: {1}, {2} LastUpdated: {3}",
                dr["CustomerId"],
                dr["LastName"],
                dr["FirstName"],
                dr["LastUpdateDate"]);

}

In VB:

For Each dr As DataRow In dt.Rows
    Debug.Print("{0}: {1}, {2} LastUpdated: {3}", _
                dr("CustomerId"), _
                dr("LastName"), _
                dr("FirstName"), _
                dr("LastUpdateDate"))
Next

NOTE:

By default, this technique only works with .txt, .csv, .tab, and .asc file extensions. If your file name has a different extension, you can either change the extension in your code before reading the file, or you can update the Extensions key in following registry setting:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Text

NOTE:

By default, this technique assumes you are working with ANSI text files. If that is not the case, you can update the CharacterSet key in the same registry setting:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Text

Though this is not recommended.

VERY IMPORTANT NOTE:

If you test this sample code by creating a text file with Visual Studio, the resulting text file will be in UTF-8 format. You need to save the file into ANSI format. The easiest way I found to do this is detailed below.

Adding a Text File to your Project:

  1. Right-click on your project in Visual Studio.
  2. Select Add | New Item from the context menu.
  3. Pick Text File from the available templates and click Add.
  4. Type in the data for the test file or paste in the text from the example at the top of this post.
  5. Save the file within Visual Studio. This creates a UTF-8 formatted file.
  6. If you plan to use the directory of the executing application, set the Copy to Output Directory to Copy always in the properties window for the file.

Converting the resulting UTF-8 file to ANSI format:

  1. Right-click on the file and select Open With
  2. Select Notepad.
  3. Select File | Save As.
  4. Set the Encoding to ANSI and click Save.
Enjoy!
Filed under: , , , , , ,

Comments

# re: Reading Comma Delimited Files

Tuesday, August 25, 2009 2:58 PM by Reed Copsey

This seems like more work than using TextFieldParser directly.  You're giving up a lot of error checking, and adding extra overhead loading the odbc + ado.net drivers.  Why not just use msdn.microsoft.com/.../microsoft.visualbasic.fileio.textfieldparser.aspx ?

# re: Reading Comma Delimited Files

Tuesday, August 25, 2009 4:07 PM by Deborah Kurata

Hi Reed -

Thank you for visiting my blog and thanks for the link!

There are benefits of reading the data into a DataTable, such as the ability to then bind to controls such as a DataGridView.

Using the technique you suggested is another choice if the application just needs to process the file line by line.

Thanks again!

# Reading Comma Delimited Files: TextFieldParser

Tuesday, August 25, 2009 4:59 PM by Deborah's Developer MindScape

In my prior post, I covered how to read a comma delimited file into an in-memory DataTable. You could

# re: Reading Comma Delimited Files

Tuesday, August 25, 2009 5:13 PM by Deborah Kurata

Hi Reed -

I just added posts for using TextFieldParser to access a comma delimited value (CSV) file and a fixed length file. Thanks for the suggestion!

# re: Reading Comma Delimited Files

Monday, August 31, 2009 10:25 AM by Steve

I'm trying the C# solution and I keep getting an error that the Microsoft Jet Engine cannot open the file ''.  That only thing different that I'm doing is instead of a OleDbConnection, I have a string in it's place.  Does the directory name for the data source need to have a \ at the end?

# re: Reading Comma Delimited Files

Monday, August 31, 2009 1:50 PM by Deborah Kurata

Can you post this question along with a code snippet here:

social.msdn.microsoft.com/.../threads

I monitor this C# forum and here it will be easier to look at your code and have a "conversation".

Thanks for visiting my blog!

# re: Reading Comma Delimited Files

Wednesday, September 02, 2009 4:37 AM by radarek

Hey, one question regarding this code:

What about UTF-8 text file and the special characters?

I have tested it and this code doesn't read multibyte character - so isteed one 2byte character you will get two 1 byte characters. Oh and Byte Order Mask at the begining is also visible :)

# re: Reading Comma Delimited Files

Wednesday, September 02, 2009 10:02 AM by Deborah Kurata

Hi Radarek -

Thanks for stopping by the blog. Yes, this technology is a little limited. There is an OEM mode (instead of the ANSI mode), but I did not try it. If you give it a try, report back and let us know if it worked for you.

Hope this helps.

# re: Reading Comma Delimited Files

Tuesday, March 30, 2010 11:18 PM by Kevin

What if the .csv file has more than one header line (in my instance, I have two header lines before the delimited data begins)?

# re: Reading Comma Delimited Files

Sunday, May 16, 2010 10:54 AM by bespoke development

Wow - really nice and easy way to implement.

I used to write pages of code! :-)

thanks

andy

# re: Reading Comma Delimited Files

Friday, November 26, 2010 4:34 AM by Asif Patel Belgaum

Very useful, it solved my problem this is one of the best post for Reading Comma Delimited Files. I modified it a bit to read .csv File in ASP.net Program

Dim fileName As String = "testCSV.csv"

       Dim dirName As String = Server.MapPath("source_folder")

       Dim dt As DataTable

       Dim cn As New OleDbConnection("Provider=Microsoft.Jet.OleDb.4.0;" & _

           "Data Source=" & dirName & ";" & _

           "Extended Properties=""Text;HDR=NO;FMT=Delimited""")

       ' Open the connection

       cn.Open()

       ' Set up the adapter

       Dim adapter As New OleDbDataAdapter("SELECT * FROM " & fileName, cn)

       dt = New DataTable("Calls")

       adapter.Fill(dt)

       DataGrid1.DataSource = dt

       DataGrid1.DataBind()

# re: Reading Comma Delimited Files

Saturday, August 13, 2011 12:41 AM by Frenchie

Heck yeah this is exactly what I nedeed.

# re: Reading Comma Delimited Files

Sunday, August 14, 2011 9:11 AM by Jayce

Wow, that's a really clever way of thniknig about it!

Leave a Comment

(required) 
(required) 
(optional)
(required) 
If you can't read this number refresh your screen
Enter the numbers above: