Transferring data around with ADO.NET 2.0 - SqlBulkCopy

Published Sat, Aug 26 2006 17:03 | William

I'm working at a client on a data migration and found some results that were surprising. I expected there to be a difference but not one quite this large...

Synopsis of Problem:

Right now I'm working on a rather complex migration from a bunch of disparate data sources into a consolidated Sql Server database.  Phase 1 simply involves getting the data into tables which mimick the source. Phase 2 involves normalizing and moving the data over.  Now that the data has been moved over, ongoing weekly data needs to be imported from various Excel sheets into the database.

Approach:

At first DTS was used.  It's a Sql Server 2000 install for the time being so SSIS isn't an option. DTS worked ok but there was one really big problem. The end user who will handle the migration is only familiar with Access and doesn't yet know their way around Sql Server's tools.  We couldn't just automatically schedule the packages to run b/c there is not a specific interval when the sheets will arrive and they may come from multiple places.  DTS would do the job but because of coupling issues, it wasn't a great solution b/c it involved way too much transition work.

So I created a Winforms app that let the user point to each sheet and just hit import.  Using an OleDbDataAdapter, I'd connect to the Excel sheet, set AcceptChangesDuringFill to false on the adapter and just call Fill.  From there I'd configure and adapter pointing to the Sql Server DB (it only needed an Insert command) and just pass the datatable filled from Excel to the Update method.  Easy enough.   But, it was SLOOOOOWWWW.  There would often be upwards of 20,000 records to import. Moreoever, just about every field was VARCHAR (255) which made the record huge.  I am fully aware that the VARCHAR issue needs to be resolved but for the time being, it's a contraint we just have to deal with.  Anyway, it could take hours to do the updates. Hours.

In comes SqlBulkCopy. It's easier than the data adapter approach. It's faster. And it's less code.

So the only real challenge if you want to call it that was writing out the column mappings.  After that, you just need to declare a SqlBulkCopy object, set the column mappings. give it  a connection string, specify a Destination table and just call the WriteToServer method. 

Results:

With one exception, the writes happen in under a second. They are so fast that I had to wonder if they actually worked.  They did.  With respect to the one sheet that has 20,000+ rows, it takes about 7 seconds from start to finish.  7 Seconds!  Compared to two+ hours.

Needless to say this was almost the text book use case for SqlBulkCopy.  The dataadapter is a wonderful and powerful object, but it's nowhere near as efficient.  In the 2.0 framework you can specify the batchUpdateSize property which can minimize round trips, but that only goes so far.  I shaved a few minutes off with it but it was still marginal.  In cases where you need to simply move data to a Sql Server database, I'd highly encourage starting out with the SqlBulkCopy b/c in short, it just flat out rocks.

Comments

# Miha Markic said on August 28, 2006 10:04 AM:

That's why SqlBulkCopy is there :-). Seriously, those numbers you found are interesting.

# Angel said on August 29, 2006 5:56 PM:

Glad to hear this! I remember the first time I helped a customer (internal to Microsoft) move his application from traditional inserts to a SqlBulkCopy solution.
A six hour process that he had been cursing at for arround three years started running in under three minutes. He had tears in his eyes, not making this up.

# Brian Madsen said on August 30, 2006 12:41 AM:

that's an excellent way to apply a technology correctly.

i've been using SqlBulkCopy for some time but have yet to measure the performance (mostly because they run at night and is less than 3hrs anyways)...

should probably check out how much of a performance gain we've gotten by using it...

# William said on September 1, 2006 8:58 PM:

Our blog is getting killed by spam comments and it's taking me forever to clean up - I have the actual results though from repeated runs and man, few features deliver more than SqlBulkCopy. It's honestly amazing.

Search

This Blog

Tags

Community

Archives

News

My other sites

Cool Stuff

Book Stuff

Security

ORM

Data Access

Funny Stuff

Compact Framework Stuff

Web Casts

My KnowledgeBase Articles

My MVP Profile

Design Patterns

Performance

Debugging

Remoting

My Fellow Authors

My Books

LINQ

Misc

Speech

Syndication

Email Notifications