Fast reading of Excel worksheets into a .NET program

Reading data from Excel into a C# or VB.NET program using Interop.Excel.dll can be very slow because each read operation goes through a slow path in Windows (about 50ms), so instead of looping through each Range or Cell object to read one cell at a time, it is much better to read the entire Range (multiple cells) of interest in one go, and then do the looping on an array inside the program, without having to pull data in from Excel on each iteration.

First get the dimensions of the sheet you wish to process. Here is a handy function to do that:

 ''' <summary>
 ''' Gets dimensions of Excel worksheet, assuming the first used cell is at (1,1).
 ''' </summary>
 ''' <param name="w">Worksheet object</param>
 ''' <param name="numrows">Returns the number of rows found to be used in the sheet</param>
 ''' <param name="numcols">Returns the number of columns found to be used in the sheet</param>
 ''' <remarks>return values are 1 based. Works even when there are blank rows or columns which appear to split the sheet.</remarks>
 Private Sub GetDimensionsOfWorksheet(ByVal w As Excel.Worksheet, ByRef numrows As Long, ByRef numcols As Long)
 'helpful hint for this at https://stackoverflow.com/questions/10752971/first-blank-row-in-excel/10753301#10753301
     numcols = DirectCast(w.Cells(1, w.Columns.Count), Excel.Range).End(Excel.XlDirection.xlToLeft).Column
     numrows = DirectCast(w.Cells(w.Rows.Count, 1), Excel.Range).End(Excel.XlDirection.xlUp).Row
 End Sub

Then use this to read in the range of values:

Dim lastcol As Long
Dim lastrow As Long
GetDimensionsOfWorksheet(worksheet, lastrow, lastcol)
Dim range As Excel.Range = worksheet.Range(worksheet.Cells(1, 1), worksheet.Cells(lastrow, lastcol))
Dim cellsStr(,) As Object = range.Value2

You might need to use .Value instead of .Value2 for cellsStr().

Then you can loop from 1 to lastrow and 1 to lastcol as array indexes of cellsStr(). For example

For r As Long = 1 To lastrow
    For c As Long = 1 To lastcol
        Debug.Print("Row " & r & " Col " & c & " Value=" & CStr(cellsStr(r, c)))
    Next c
Next r

My measurements are an improvement from 9.6ms per cell to 1.5ms per cell based on a 5 worksheet, 650 used cells spreadsheet.

Thanks to https://stackoverflow.com/a/39898500/8157 for this technique.

This article was updated on February 14, 2020