MSMVPS.COM
The Ultimate Destination for Blogs by Current and Former Microsoft Most Valuable Professionals.

SSIS Issue with Excel source

I've been doing a bit of SSIS (the data transformation technology formally know as DTS), and came across a nasty issue with using an Excel source to import data.  The problem may not actually lay with SSIS - the Excel Source in SSIS uses Jet under the hood, and the problem could rest entirely with Jet.

The problem occurs where you have a sparsely populated column - SSIS/Jet will only sample a certain number of rows (I think it is configurable in the registry somewhere) to determine the type of the data held in the column.  Say, for arguments sake, only 20 rows are sampled and no data is found.  The type of the data will default to a string.  If row 21 contains a numeric type or a date, this row will be silently ignored and treated as a null.  This is a pretty crappy default behaviour - silently ignoring data that you are spending a lot of time (and clients money) to import.  You have been warned!

Update: The registry key name is Hkey_Local_Machine/Software/Microsoft/Jet/4.0/Engines/Excel/TypeGuessRows, and has a default value of 8.  I've increased this value in my registry significantly.


Posted Mar 15 2006, 08:39 AM by nick
Filed under:

Comments

David Conti wrote re: SSIS Issue with Excel source
on 03-15-2006 4:49
I'd agree that its with Jet, seen the exact same issue with excel imports (before SSIS was even available)
Elaine van Bergen wrote re: SSIS Issue with Excel source
on 03-15-2006 15:46
I ran across a simlar problem using DTS in sql 2000.
It was related to the TypeGuessRows setting of the jet provider as detailed in this article.
http://support.microsoft.com/kb/281517

I agree it is pretty crappy behaviour, even if you know the fix it can be a bit tricky to explain why you would need to poke around in the registry on a Prod server.

In the end we used a manual work around instead which involved putting dummy data in the first row of the spreadsheet template that was hidden then ignoring it later in the import process.

SM wrote re: SSIS Issue with Excel source
on 03-15-2006 23:42
Nick,

I had a similar problem in situation where I was reading a excel file with in .Net, to overcome this I had to add a extra driver attributes 'IMEX'.
When using the Jet engine you can actually ensure the engine does not assume any data type.

See the small sample below:
private DataTable LoadDataTable(string excelFileName)
{
string excelCon = String.Format(@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0};Extended Properties=""Excel 8.0; IMEX=1""", excelFileName);

using (DataTable dataTable = new DataTable("PointsTransfer"))
{
using (OleDbConnection dbConnection = new OleDbConnection(excelCon))
{
using (OleDbDataAdapter oleDbDataAdapter = new OleDbDataAdapter("SELECT * FROM [sheet1", dbConnection))
{
oleDbDataAdapter.Fill(dataTable);
}
}
return dataTable;

}
}
Hope this helps

Cheers
S.M


Copyright © is the original authors. Blog site is an independent site not sponsored by Microsoft. The Yoda blog server and the Brianna SQL server would like to thank www.ownwebnow.com and www.exchangedefender.com. They wouldn't be here and broadcasting without the generosity of Vlad Mazek and his companies.

Powered by Community Server (Commercial Edition), by Telligent Systems