Reading Fixed Length Files

Posted Tue, Aug 25 2009 13:34 by Deborah Kurata

There may be times that you need to read fixed length files into your application. For example, you obtain output from a legacy system or other application in a fixed length text file format, and you need to read and use that data in your application.

NOTE: For more information on fixed length files, see this link.

.NET provides several techniques for reading text files. This post focuses on how to read a fixed length text file into a DataTable.

You may find it very useful to read your text file into a DataTable, whether or not you plan to use a database. Reading a text file into a DataTable not only saves you a significant amount of string manipulation coding, it also makes it easy to access the imported data from within your application.

For example, you can bind the resulting DataTable to a grid or other controls. You can also use LINQ, as in this example, to manipulate the resulting data. All of the features of the DataTable are then available to you.

BIG NOTE: Many developers ignore this technique because one look at the code leads them to assume it is somehow associated with a database. It is NOT. The technique uses in-memory DataTable objects only.

For this example, the text file appears as follows:

000001  Baggins             Bilbo     20090811
000002  Baggins             Frodo     20090801
000003  Gamgee              Samwise   20090820
000004  Cotton              Rosie     20090821

Notice several things about this file:

  1. The columns are a fixed width.
  2. There is no header row that provides the column names. You could add column headers here if desired.

The first step in reading the file is to define a schema.ini file that defines the column widths. The file must follow these specifications:

  • The file must be called schema.ini.
  • The file must exist in the same directory as the text file.
  • The file must be in ANSI format. (See the note at the bottom of this post for information on saving a file to ANSI format.)

The contents of the schema.ini file for the example above are shown below:

[testFixed.txt]
ColNameHeader=False
Format=FixedLength
DateTimeFormat=yyyymmdd
Col1=CustomerId Text Width 6
Col2=LastName Text Width 22
Col3=FirstName Text Width 10
Col4=LastUpdateDate DateTime Width 8

The first line of the file is always the name of the associated text file enclosed in square brackets ([ ]).

The next set of lines define basic attributes of the text file:

  • ColNameHeader: In this case, there is no column header in the text file, so this property is set to false. The system will assume that the first line of the text file is the header unless you specify otherwise.
  • Format: In this case, the format is FixedLength. The system will assume comma delimited unless you specify otherwise.
  • DateTimeFormat: If you have a date in your file, you can specify the format here.

The last set of lines defines each column in the text file. The format of these lines is as follows:

Colx=ColumnName ColumnType Width ColumnWidth

See this link for more information on the contents of the schema.ini file.
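
If you would rather generate the schema.ini file from code than create it by hand, something like the following sketch could work. The class and method names (SchemaIniWriter, WriteSchemaFor) are made up for illustration and are not part of the original sample. It writes the same entries shown above into the directory containing the text file, using ASCII encoding so that no byte-order mark is written.

```csharp
using System.IO;
using System.Text;

static class SchemaIniWriter
{
    // Hypothetical helper (not from the original post): writes a schema.ini
    // describing the sample file into the same directory as that file.
    // Note: this overwrites any schema.ini already in that directory.
    public static string WriteSchemaFor(string dataFilePath)
    {
        string fullPath = Path.GetFullPath(dataFilePath);
        string schemaPath =
            Path.Combine(Path.GetDirectoryName(fullPath), "schema.ini");

        string[] lines =
        {
            "[" + Path.GetFileName(fullPath) + "]", // entry name = text file name
            "ColNameHeader=False",
            "Format=FixedLength",
            "DateTimeFormat=yyyymmdd",
            "Col1=CustomerId Text Width 6",
            "Col2=LastName Text Width 22",
            "Col3=FirstName Text Width 10",
            "Col4=LastUpdateDate DateTime Width 8"
        };

        // These lines are plain ASCII, which is valid in any ANSI code page,
        // and Encoding.ASCII never writes a byte-order mark.
        File.WriteAllLines(schemaPath, lines, Encoding.ASCII);
        return schemaPath;
    }
}
```

Calling SchemaIniWriter.WriteSchemaFor with the full path to testFixed.txt would place schema.ini next to that file. If the directory holds several text files, you would need to merge entries rather than overwrite the file.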

You can then read the file using the following code.

In C#:

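// Requires: using System.Data; using System.Data.OleDb;
//           using System.IO; using System.Windows.Forms;
string fileName = "testFixed.txt";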
string dirName = Path.GetDirectoryName(Application.ExecutablePath);
DataTable dt;

using (OleDbConnection cn =
    new OleDbConnection(@"Provider=Microsoft.Jet.OleDb.4.0;" +
            "Data Source=" + dirName + ";" +
            "Extended Properties=\"Text;\""))
{
    // Open the connection
    cn.Open();

    // Set up the adapter
    using (OleDbDataAdapter adapter =
        new OleDbDataAdapter("SELECT * FROM " + fileName, cn))
    {
        dt = new DataTable("Customer");
        adapter.Fill(dt);
    }
}

In VB:

' Requires Imports for System.Data, System.Data.OleDb,
' System.IO and System.Windows.Forms
Dim fileName As String = "testFixed.txt"
Dim dirName As String = _
            Path.GetDirectoryName(Application.ExecutablePath)
Dim dt As DataTable

Using cn As New OleDbConnection("Provider=Microsoft.Jet.OleDb.4.0;" & _
            "Data Source=" & dirName & ";" & _
            "Extended Properties=""Text;""")
    ' Open the connection
    cn.Open()

    ' Set up the adapter
    Using adapter As New OleDbDataAdapter( _
            "SELECT * FROM " & fileName, cn)
        dt = New DataTable("Customer")
        adapter.Fill(dt)
    End Using
End Using

This code starts by declaring variables to hold the text file name, directory containing the file and the resulting DataTable.

This technique only works with a standard set of file name extensions (see the NOTE at the end of this post). The file can reside in any directory. In this example, the file resides in the directory from which the application is executed, but this is not a requirement.

The first using statement in the example code sets up the connection string for connecting to the directory. It sets the Provider property to use the Microsoft.Jet.OleDb provider. The Data Source property defines the directory containing the text file. The Extended Properties value specifies that the data source is text ("Text;"). The Extended Properties value must be enclosed in quotes, so doubled double-quotes (VB) or backslash-quote (C#) are used to escape the quotes.

If a schema.ini file exists in the directory defined as the data source and has a bracketed entry with the text file name, that .ini file is used to determine any other extended properties. So no other extended properties are defined in the connection string itself.
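
Because a missing or mismatched schema.ini entry means the column definitions are not applied, it can help to check for the entry before opening the connection. The following is only a sketch: SchemaIniCheck and HasEntryFor are hypothetical names, and the case-insensitive comparison is an assumption based on Windows file names being case-insensitive.

```csharp
using System;
using System.IO;

static class SchemaIniCheck
{
    // Hypothetical pre-flight check: returns true when schema.ini exists in
    // dirName and contains a "[fileName]" entry header on its own line.
    public static bool HasEntryFor(string dirName, string fileName)
    {
        string schemaPath = Path.Combine(dirName, "schema.ini");
        if (!File.Exists(schemaPath))
            return false;

        string header = "[" + fileName + "]";
        foreach (string line in File.ReadAllLines(schemaPath))
        {
            // Assumption: the entry name is matched without regard to case.
            if (string.Equals(line.Trim(), header,
                              StringComparison.OrdinalIgnoreCase))
                return true;
        }
        return false;
    }
}
```

With the variables from the sample code, SchemaIniCheck.HasEntryFor(dirName, fileName) could be called just before cn.Open() to fail early with a clearer message.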

The code then opens the connection, thereby opening the file and the associated schema.ini file. Since this code is in a using statement, the files are automatically closed at the end of the using block.

The second using statement sets up the DataAdapter by defining a Select statement and the open connection. The Select statement selects all of the information from a specific file as defined by the fileName variable.

The code then creates the DataTable, giving the table a name. In this example, the table name is "Customer".

Finally, it uses the Fill method of the DataAdapter to read the data from the text file into the DataTable.

Using the technique detailed here, you can view the resulting DataTable. In this example, the column headings come from the column names defined in the schema.ini file, because the text file has no header row.

[Screenshot: the resulting DataTable]

Note how the date in the above screen shot appears as a standard date column.

You can then access the data in the table as you access any other DataTable. For example:

In C#:

foreach (DataRow dr in dt.Rows)
{
    Debug.Print("{0}: {1}, {2} LastUpdated: {3}",
                dr["CustomerId"],
                dr["LastName"],
                dr["FirstName"],
                dr["LastUpdateDate"]);

}

In VB:

For Each dr As DataRow In dt.Rows
    Debug.Print("{0}: {1}, {2} LastUpdated: {3}", _
                dr("CustomerId"), _
                dr("LastName"), _
                dr("FirstName"), _
                dr("LastUpdateDate"))
Next
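
Because the result is an ordinary DataTable, you can also query it with LINQ. The sketch below assumes the column names and types from the schema.ini file above (Text columns as strings, LastUpdateDate as a non-null DateTime) and a reference to System.Data.DataSetExtensions. CustomerQueries and FirstNamesFor are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

static class CustomerQueries
{
    // Hypothetical helper: first names of customers with the given last name,
    // ordered by LastUpdateDate (oldest first).
    // Assumes LastUpdateDate is never null in the table.
    public static List<string> FirstNamesFor(DataTable dt, string lastName)
    {
        var query =
            from row in dt.AsEnumerable()
            // Trim in case the fixed-width padding is kept in the value.
            where (row.Field<string>("LastName") ?? "").Trim() == lastName
            orderby row.Field<DateTime>("LastUpdateDate")
            select (row.Field<string>("FirstName") ?? "").Trim();

        return query.ToList();
    }
}
```

For a table holding the sample data, CustomerQueries.FirstNamesFor(dt, "Baggins") should give Frodo followed by Bilbo, since Frodo's row has the earlier date.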

NOTE:

By default, this technique only works with .txt, .csv, .tab, and .asc file extensions. If your file name has a different extension, you can either change the extension in your code before reading the file, or you can update the Extensions key in the following registry setting:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Text
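
One way to change the extension in code, without touching the registry or the original file, is to copy the file to a .txt name before reading it. This is only a sketch; TextExtensionHelper and CopyAsTxt are hypothetical names. Keep in mind that the bracketed entry in schema.ini would have to name the copy, not the original file.

```csharp
using System;
using System.IO;

static class TextExtensionHelper
{
    // Hypothetical helper: copies sourcePath to a sibling file with a .txt
    // extension (overwriting any existing copy) and returns the new path.
    public static string CopyAsTxt(string sourcePath)
    {
        string target = Path.ChangeExtension(sourcePath, ".txt");

        // Skip the copy when the file already has a .txt extension;
        // copying a file onto itself would throw an IOException.
        bool samePath = string.Equals(Path.GetFullPath(sourcePath),
                                      Path.GetFullPath(target),
                                      StringComparison.OrdinalIgnoreCase);
        if (!samePath)
            File.Copy(sourcePath, target, true);

        return target;
    }
}
```

The file name passed to the Select statement would then be Path.GetFileName of the returned path.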

NOTE:

By default, this technique assumes you are working with ANSI text files. If that is not the case, you can update the CharacterSet key in the same registry setting, though this is not recommended:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Text

VERY IMPORTANT NOTE:

If you test this sample code by creating a text file with Visual Studio, the resulting text file will be in UTF-8 format. You need to save the file in ANSI format. The easiest way I found to do this is detailed below.

Adding a Text File to your Project:

  1. Right-click on your project in Visual Studio.
  2. Select Add | New Item from the context menu.
  3. Pick Text File from the available templates and click Add.
  4. Type in the data for the test file or paste in the text from the example at the top of this post.
  5. Save the file within Visual Studio. This creates a UTF-8 formatted file.
  6. If you plan to use the directory of the executing application, set the Copy to Output Directory to Copy always in the properties window for the file.

Converting the resulting UTF-8 file to ANSI format:

  1. Right-click on the file and select Open With.
  2. Select Notepad.
  3. Select File | Save As.
  4. Set the Encoding to ANSI and click Save.
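
If you need to do the same conversion from code instead of Notepad, a sketch along these lines may work. It assumes the .NET Framework, where Encoding.Default is the system ANSI code page. AnsiConverter and ConvertUtf8ToAnsi are hypothetical names, and characters that the ANSI code page cannot represent would be lost.

```csharp
using System.IO;
using System.Text;

static class AnsiConverter
{
    // Hypothetical helper: rewrites a UTF-8 text file in place using the
    // system default encoding.
    public static void ConvertUtf8ToAnsi(string path)
    {
        // Reading as UTF-8 also consumes a leading byte-order mark, if any.
        string text = File.ReadAllText(path, Encoding.UTF8);

        // Assumption: running on the .NET Framework, where Encoding.Default
        // is the system ANSI code page. On newer runtimes it is UTF-8
        // without a byte-order mark, so only ASCII text is safe there.
        File.WriteAllText(path, text, Encoding.Default);
    }
}
```

For the sample data, which is plain ASCII, the practical effect is that the byte-order mark written by Visual Studio is removed.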

Enjoy!

Comments

# Reading Fixed Length Files: TextFieldParser

Tuesday, August 25, 2009 5:05 PM by Deborah's Developer MindScape

In my prior post, I covered how to read a fixed length file into an in-memory DataTable. You could then

# Formatting Text Files

Tuesday, August 25, 2009 5:08 PM by Deborah's Developer MindScape

There are often times that you need to write out text files containing data managed by your application

# re: Reading Fixed Length Files

Thursday, September 10, 2009 3:30 PM by Chris

Trying this on a fixed width file, but it doesn't work unless I have an end of line character.  The file I am working with is just one long stream of characters.  Thanks anyway!

# re: Reading Fixed Length Files

Friday, October 16, 2009 10:46 AM by Sri

Hi Deborah

I liked the simplicity of your articles. I had a task to read a tab delimited text file into a dataset and was using this code. But my limitation is that I can't create any file on the server, i.e. schema.ini. Is there a way to use OleDb to read a tab delimited file without using a schema.ini file?

# re: Reading Fixed Length Files

Friday, October 16, 2009 11:28 AM by Deborah Kurata

Hi Sri -

Did you see this blog post?

msmvps.com/.../reading-comma-delimited-files-textfieldparser.aspx

Hope this helps.

# re: Reading Fixed Length Files

Friday, October 16, 2009 12:31 PM by Lon Feuerhelm

Deborah,

Great post. This is probably a matter of personal taste, but if I were coding this I would define the Customer ID as 8 and the Last Name as 20. I prefer to deal with trailing spaces; I've always felt leading spaces are more problematic to deal with when processing data.

# re: Reading Fixed Length Files

Monday, October 19, 2009 9:20 AM by Sri

Thanks Deborah, I read the suggested post and found that it would be easier to use the schema.ini file. So now I have access to temporarily create a schema.ini file and export the tab delimited file to a dataset.

This was working great with test data, but when I tried it with production data it failed because of a datatype issue. Some of the rows have numeric data and some have text in one of the columns. So I modified the schema.ini file to add "Col3=C Text Width 100", but unfortunately I am still getting the same datatype error. I am not sure whether schema.ini is being read or not. I would appreciate your suggestions.

# re: Reading Fixed Length Files

Tuesday, November 17, 2009 2:05 PM by vishal

My text file adds and subtracts the text in my application, but it doesn't show the text inside.

Could you please let me know why that is so?

# re: Reading Fixed Length Files

Tuesday, November 17, 2009 3:13 PM by Deborah Kurata

Hi vishal -

Thank you for stopping by my blog. Please post your question here:

social.msdn.microsoft.com/.../categories

The forums provide a much easier place for asking questions that require reviewing code and submitting follow up questions.

I monitor the forums often, and there are many experts there that can help you with any issues you are having.

Hope this helps.

# re: Reading Fixed Length Files

Saturday, May 22, 2010 8:03 AM by staples

Hi, is it possible to read fixed length files without creating the schema.ini? If so, how would it be done? Thanks.

# re: Reading Fixed Length Files

Monday, May 24, 2010 10:49 AM by Deborah Kurata

Hi Staples -

There are many ways to read fixed length files. Another way is defined here:

msmvps.com/.../reading-fixed-length-files-textfieldparser.aspx

Hope this helps.

# re: Reading Fixed Length Files

Saturday, March 05, 2011 5:25 PM by Enrique

Hi Deborah !

Thank you for taking the time to post this excellent article.

My question is: since you are using Provider=Microsoft.Jet.OleDb.4.0, do I need to have Access installed on the client PC?

Thanks a bunch !

# re: Reading Fixed Length Files

Saturday, March 05, 2011 6:35 PM by Deborah Kurata

Hi Enrique -

You can include the dlls with your application. See this link for the redistribution:

www.microsoft.com/.../details.aspx

Hope this helps.

# re: Reading Fixed Length Files

Friday, March 30, 2012 6:15 AM by Niral Patel

Hi,

This seems very useful for my next task.

I just want to know how I can convert this DataTable to an XML file.

Please let me know.

Thanks in advance.
