Reading Comma Delimited Files: TextFieldParser
Posted Tue, Aug 25 2009 14:58 by Deborah Kurata
In my prior post, I covered how to read a comma delimited file into an in-memory DataTable. You could then work with the DataTable as desired to access the fields from the file. You could even bind the resulting DataTable to a grid or other control.
But sometimes you just need to process the file line-by-line and need a simpler solution. That is where the TextFieldParser class comes in.
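The TextFieldParser works with any file extension. By default it reads the file as UTF-8 (detecting a byte order mark if one is present), and a constructor overload accepts an Encoding, so you can also use it with ANSI text files.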
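As a minimal sketch of the encoding point, the helper below opens a comma delimited file with an explicitly chosen encoding, using the TextFieldParser constructor that takes a path and an Encoding. The module name EncodingSketch and the function name OpenCsv are made up for illustration; they are not part of the framework.

```vb
Imports System.IO
Imports System.Text
Imports Microsoft.VisualBasic.FileIO

' Hypothetical helper module, for illustration only.
Module EncodingSketch
    ' Opens a comma delimited file using the encoding the caller supplies.
    ' The caller is responsible for disposing the returned parser,
    ' for example by wrapping it in a Using block.
    Function OpenCsv(filePath As String, fileEncoding As Encoding) As TextFieldParser
        Dim parser As New TextFieldParser(filePath, fileEncoding)
        parser.TextFieldType = FieldType.Delimited
        parser.SetDelimiters(",")
        Return parser
    End Function
End Module
```

For an ANSI file on the .NET Framework, you could pass Encoding.Default, which there is the system ANSI code page. On newer .NET runtimes Encoding.Default is UTF-8, so you would need to pick the encoding explicitly.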
For this example, the text file is as follows:
1, Baggins, Bilbo, 20090811
2, Baggins, Frodo, 20090801
3, Gamgee, Samwise, 20090820
4, Cotton, Rosie, 20090821
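NOTE: Be sure your project references the Microsoft.VisualBasic assembly and imports the Microsoft.VisualBasic.FileIO namespace. (Visual Basic projects reference this assembly by default.)
Since this class is part of the Visual Basic namespace, only the VB code is presented here. However, you can add a reference to Microsoft.VisualBasic.dll in a C# program, import this namespace, and use the class there as well.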
' At the top of the code file:
Imports System.IO
Imports Microsoft.VisualBasic.FileIO

' Inside a method of a Windows Forms application:
Dim fileName As String = "testCSV.txt"
Dim dirName As String = _
    Path.GetDirectoryName(Application.ExecutablePath)
Using tf As New TextFieldParser _
    (Path.Combine(dirName, fileName))
    tf.TextFieldType = FieldType.Delimited
    tf.SetDelimiters(",")
    Dim row As String()
    While Not tf.EndOfData
        Try
            row = tf.ReadFields()
            For Each field As String In row
                ' Do whatever to the set of fields
                Debug.WriteLine(field)
            Next
        Catch ex As MalformedLineException
            ' ex.LineNumber is the number of the line that could not be parsed
            MessageBox.Show("Line " & ex.LineNumber.ToString() & _
                " is not valid and will be skipped.")
        End Try
    End While
End Using
This code first declares variables to hold the text file name and the directory name. The file can reside in any directory that the user can access. In this example, the file resides in the same directory where the application is executed. But this is not a requirement.
The Using statement in the example code creates an instance of the TextFieldParser class and disposes of it, closing the file, when the block ends. The parameter to the constructor defines the full path of the text file. The Path.Combine method is used to ensure that a directory separator is inserted between the directory name and the file name if one is needed.
The code then configures the parser in two steps. First, it sets the TextFieldType property. Since this is a delimited file, the Delimited type is specified. Second, it calls the SetDelimiters method. This defines the delimiter used for the file. This example uses a comma separated value (CSV) file, so a comma is specified.
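SetDelimiters accepts one or more delimiter strings, so the same setup works for files separated by semicolons, tabs, or a mix. The sketch below illustrates this on a single line of in-memory text, using the TextFieldParser constructor that takes a TextReader. The module name DelimiterSketch and the function name SplitLine are hypothetical.

```vb
Imports System.IO
Imports Microsoft.VisualBasic.FileIO

' Hypothetical helper module, for illustration only.
Module DelimiterSketch
    ' Splits one line of text into fields using the given delimiters.
    Function SplitLine(line As String, ParamArray delimiters As String()) As String()
        Using parser As New TextFieldParser(New StringReader(line))
            parser.TextFieldType = FieldType.Delimited
            parser.SetDelimiters(delimiters)
            Return parser.ReadFields()
        End Using
    End Function
End Module
```

For example, SplitLine("4; Cotton; Rosie", ";") splits on semicolons, and SplitLine(text, ",", ";") treats either character as a delimiter.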
The While loop processes each line of the file. The ReadFields method reads the current line of the file into a string array and advances to the next line. If a line cannot be parsed, ReadFields throws a MalformedLineException; the Try/Catch block catches the error, displays a message and continues with the next line. (However, in a real application you would want to log this information instead of displaying it to the user.)
The For Each/Next statement loops through the fields in the row and displays them. This is where you would add the code that processes the values from the text file.
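One possible way to record bad lines rather than display them is sketched below. It collects a note for each rejected line using the LineNumber property of MalformedLineException and the ErrorLine property of the parser. The module name ErrorLogSketch, the function name ReadRows, and the errors list are assumptions made for illustration; a real application would likely write to its own logging facility.

```vb
Imports System.Collections.Generic
Imports System.IO
Imports Microsoft.VisualBasic.FileIO

' Hypothetical helper module, for illustration only.
Module ErrorLogSketch
    ' Returns every row that parsed successfully and appends one note
    ' to the errors list for each line the parser rejected.
    Function ReadRows(source As TextReader, errors As List(Of String)) As List(Of String())
        Dim rows As New List(Of String())
        Using parser As New TextFieldParser(source)
            parser.TextFieldType = FieldType.Delimited
            parser.SetDelimiters(",")
            While Not parser.EndOfData
                Try
                    rows.Add(parser.ReadFields())
                Catch ex As MalformedLineException
                    ' ErrorLine holds the text of the line that caused the exception.
                    errors.Add("Line " & ex.LineNumber.ToString() & _
                        " skipped: " & parser.ErrorLine)
                End Try
            End While
        End Using
        Return rows
    End Function
End Module
```

A line with unbalanced or misplaced quotation marks, such as a closing quote followed by more text, is one typical cause of a malformed line.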
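As one example of what that processing might look like, the sketch below converts a row from the sample file into a typed object. The Person class, its member names, and the reading of the fourth column as a date in yyyyMMdd format are all assumptions made for illustration; the sample file does not say what its columns mean.

```vb
Imports System.Globalization

' Hypothetical helper module, for illustration only.
Module RowMappingSketch
    ' Hypothetical holder for one row of the sample file.
    Class Person
        Public Id As Integer
        Public LastName As String
        Public FirstName As String
        Public EntryDate As Date
    End Class

    ' Converts the string array returned by ReadFields into a Person.
    ' Assumes the row has four fields in the order shown in the sample file.
    Function ToPerson(row As String()) As Person
        Dim p As New Person()
        p.Id = Integer.Parse(row(0), CultureInfo.InvariantCulture)
        p.LastName = row(1)
        p.FirstName = row(2)
        ' Assumption: the last column is a date written as yyyyMMdd.
        p.EntryDate = Date.ParseExact(row(3), "yyyyMMdd", CultureInfo.InvariantCulture)
        Return p
    End Function
End Module
```

Inside the While loop, you could call ToPerson(row) in place of the For Each statement. Integer.Parse and Date.ParseExact throw a FormatException on bad values, so a real application would need to decide how to handle those rows.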
Notice that on each pass through the loop, you replace the contents of the row variable with the fields from the next line. So after a row is processed, you cannot access the data from that row again. (If you do need more random access to the data, you can store each row in a list or other structure, or you can read the data into a DataTable using the technique detailed in my prior post.)
The resulting information in the Debug window is as follows. (The spaces after the commas in the file do not appear because the parser trims white space from each field by default.)
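A minimal sketch of the list approach follows. It keeps every row in a List(Of String()) so that any row and field can be read again later by index. The module name RowCacheSketch and the function name LoadRows are hypothetical, and error handling is left out to keep the sketch short.

```vb
Imports System.Collections.Generic
Imports System.IO
Imports Microsoft.VisualBasic.FileIO

' Hypothetical helper module, for illustration only.
Module RowCacheSketch
    ' Reads all rows from the reader into a list for later random access.
    Function LoadRows(source As TextReader) As List(Of String())
        Dim rows As New List(Of String())
        Using parser As New TextFieldParser(source)
            parser.TextFieldType = FieldType.Delimited
            parser.SetDelimiters(",")
            While Not parser.EndOfData
                rows.Add(parser.ReadFields())
            End While
        End Using
        Return rows
    End Function
End Module
```

With the sample file loaded this way, rows(2)(1) refers to the second field of the third line. To read from a file rather than from memory, you could pass New StreamReader(path) as the source.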
1
Baggins
Bilbo
20090811
2
Baggins
Frodo
20090801
3
Gamgee
Samwise
20090820
4
Cotton
Rosie
20090821
Enjoy!