Today we needed to insert a large amount (250,000 rows) of automatically generated data into a SQL Server database. Generating the data was very fast, but inserting each line with a SQL Insert statement, painfully slow, even if logging is in Simple Mode in the SQL database.
So… we investigated the SQLBulkCopy function in .NET, but this is usually used to load the data from a text file, or from another table. How to do it from data which we have in memory?
This is the table we’re trying to fill (SQL Server code to make the table)
CREATE TABLE [dbo].[tblDiaryEntries]( [diary_id] [int] IDENTITY(1,1) NOT NULL, [person_id] [int] NULL, [datewhen] [date] NULL, [what] [varchar](50) NULL, [is_live] [int] NOT NULL, [why] [varchar](50) NULL, CONSTRAINT [PK_tblDiaryEntries] PRIMARY KEY CLUSTERED ( [diary_id] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY] GO
Try this at the top in your imports section (this is VB.net code)
Imports System.Data.SqlClient
… and in your code, this to build a in memory data table from the source data (in our case, a hashtable of hashtables!)
Dim dtable As DataTable = New DataTable("tmpforinsert") Dim column As DataColumn column = New DataColumn() column.DataType = System.Type.GetType("System.Int32") column.ColumnName = "diary_id" dtable.Columns.Add(column) column = New DataColumn() column.DataType = System.Type.GetType("System.Int32") column.ColumnName = "person_id" dtable.Columns.Add(column) column = New DataColumn() column.DataType = System.Type.GetType("System.DateTime") column.ColumnName = "datewhen" dtable.Columns.Add(column) column = New DataColumn() column.DataType = System.Type.GetType("System.String") column.ColumnName = "what" dtable.Columns.Add(column) column = New DataColumn() column.DataType = System.Type.GetType("System.Int32") column.ColumnName = "is_live" dtable.Columns.Add(column) column = New DataColumn() column.DataType = System.Type.GetType("System.String") column.ColumnName = "why" dtable.Columns.Add(column) For Each who As Integer In diary.Keys Dim thisdiaryentry As Hashtable = diary(who) If Not thisdiaryentry Is Nothing Then For Each _when As Date In thisdiaryentry.Keys Dim thiscc As CodeInfo = thisdiaryentry(_when) Dim dr As DataRow = dtable.NewRow() dr.SetField("person_id", who) dr.SetField("datewhen", _when) dr.SetField("what", thiscc.Code) dr.SetField("why", thiscc.Info) dr.SetField("is_live", who) dtable.Rows.Add(dr) Next End If Next
and then this to actually do the insert.
Dim bulkinsert As New SqlBulkCopy(config.sql_connection) bulkinsert.DestinationTableName = "tbldiaryentries" bulkinsert.WriteToServer(dtable)
That’s rather quicker, isn’t it? For us, it takes something that took 30 mins to run, and made it subsecond.