A while ago I blogged about a solution to using the Microsoft.Jet.OLEDB.4.0 provider on x64 systems. The goal of using Microsoft.Jet.OLEDB.4.0 was to connect to and read data out of user-loved .XLS files (Microsoft Excel <= 2003).
Jumping forward in technology (Microsoft Excel >= 2007) we now have to deal with reading .XLSX files (Office Open XML). These files follow the Office Open XML standard and are meant to be easier to read. One would think that since they are just zip files containing the guts of the document they would be easy (and possibly fun) to read, based to my experience so far, not so much.
Enter EPPlus a helpful “.net library that reads and writes Excel 2007 files using the Office Open Xml format (xlsx)”. With minimal tinkering I was able to read the .xlsx document, that my users wanted to specify the data in, process it and save it to a database.
The EPPlus project is hosted on CodePlex and seems to be well supported.
Here is how you can use the EPPlus library:
1. You will need to add a reference to EPPlus dll (download it from the CodePlex site).
2. You will need to reference OfficeOpenXml (A namespace in the EPPlus dll).
3. Then simply read the file and work through the workbook and worksheets
// Get the file we are going to process
var existingFile = new FileInfo(filePath);
// Open and read the XlSX file.
using (var package = new ExcelPackage(existingFile))
// Get the work book in the file
ExcelWorkbook workBook = package.Workbook;
if (workBook != null)
if (workBook.Worksheets.Count > 0)
// Get the first worksheet
ExcelWorksheet currentWorksheet = workBook.Worksheets.First();
// read some data
object col1Header = currentWorksheet.Cells[0, 1].Value;
Should you feel a need to see some code in action take a look at the quick example I threw together, you can download it from here.
For those that have got this far and are interested in where the data comes from... head over the most excellent XKCD and do no work for the rest of the day. :)