
Using Apache POI to Read Excel Files
Here is a recipe for programmatically reading data from an Excel spreadsheet using Apache POI. Note that in this example, I’m using the spreadsheet to store test data. However, this is NOT an endorsement to manage your test data this way. This is just one practical example of why you may want to programmatically read from a spreadsheet.
Recipe to Programmatically Read from Excel Spreadsheets
Ingredients
- Apache POI
- Excel spreadsheet
Instructions
- Open spreadsheet
- Switch to worksheet you want to read from
- Read header row to identify columns
- Get data from a specific cell
Live Stream Video
To demonstrate this recipe in the context of test automation, we’ll use Excel to store test data, read the data from the spreadsheet using Apache POI, then supply the data to Selenium WebDriver for scenario execution.
Ingredients
1 Apache POI
We’ll need both the poi and poi-ooxml dependencies from mvnrepository. This is the library that enables us to read from the Excel spreadsheet.
1 2 3 4 5 6 7 8 9 10 11 |
<dependency> <groupId>org.apache.poi</groupId> <artifactId>poi</artifactId> <version>4.1.2</version> </dependency> <dependency> <groupId>org.apache.poi</groupId> <artifactId>poi-ooxml</artifactId> <version>4.1.2</version> </dependency> |
2 Excel Spreadsheet
Inside of the Excel spreadsheet, I’ll make three worksheets: the first and last being blank, and the second one containing test data for this scenario. The purpose of having multiple worksheets is simply to demonstrate how to switch to different sheets within our code.
Instructions
1 Open Spreadsheet
In my web scenario, I want to find a specific transaction.
To do so, I need to enter the account ID and the transaction ID. That data lives in our spreadsheet, so let’s open the file. To do so, we create a File object and pass it to WorkbookFactory.create(). The create() method throws an exception so we need to use a try/catch block.
1 2 3 4 5 6 7 |
File file = new File("resources/Test Data.xlsx"); try(var workbooks = WorkbookFactory.create(file)){ }catch(Exception e){ e.printStackTrace(); } |
Side note: Here I’m using the try with resources feature in Java. This is a nice, streamlined way to work with AutoCloseable resources. Learn more in my chapter on Exceptions in my free Java course.
2 Switch to Worksheet
Now that we’re in the worksheet, we need to go to the tab that we want to read from. For this scenario, we’re working with the user john, so we go to the “john” tab which is the second worksheet. To go to a specific worksheet, call the getSheet() method.
1 2 3 4 5 |
try(var workbooks = WorkbookFactory.create(spreadsheet)){ currentSheet = workbooks.getSheet("john"); }catch(Exception e){ e.printStackTrace(); } |
3 Read Header Row
I strongly recommend using the first row of your worksheet to name each of the columns. This will allow you to read the data more reliably, even if the order of the columns switch later on. Notice here on row 1, we have labeled each column. So as opposed to randomly trying to read cell A2 with the assumption that it’s an account ID, we can instead go to the Account column and then read the data.
To keep track of the columns, I’m going to read them in by calling getRow(0) to go to the first row [note: indices start at 0], and getting each cell in row 1 and storing the cell contents into a Map. On line 7 of the code below, I get the data in the cell by calling getStringCellValue(). Note that this works because the data in the header row are all Strings. We’ll see how to handle other data types in a bit.
1 2 3 4 5 6 7 8 9 10 11 |
Map<String, Integer> columns; Sheet currentSheet; try(var workbooks = WorkbookFactory.create(spreadsheet)){ currentSheet = workbooks.getSheet("john"); currentSheet.getRow(0).forEach(cell ->{ columns.put(cell.getStringCellValue(), cell.getColumnIndex()); }); }catch(Exception e){ e.printStackTrace(); } |
4 Get Data from Cell
Ok, remember, our goal here is to get the account ID and the transaction ID to pass into this web form. So, we need to identify which row we want to read from, and then get the data in the Account ID and Transaction ID columns.
Here I indicate that I’d like to read from the second row (index 1), and I specify the column I’d like to read from.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
int row = 1; String accountId = getCellData("Account ID", row); String transactionId = getCellData("Transaction ID", row); public String getCellData(String column, int row){ var dataRow = currentSheet.getRow(row); return getCellDataAsString(dataRow.getCell(columns.get(column))); } public String getCellDataAsString(Cell cell){ return switch(cell.getCellType()){ case STRING -> cell.getStringCellValue(); case NUMERIC -> String.valueOf((int)cell.getNumericCellValue()); default -> ""; }; } |
The getCellDataAsString() method here demonstrates how to read different data types from the spreadsheet. This is my least favorite thing about Apache POI because I have to account for all of the various data types, when I don’t really care about.
I’d like everything to be a String for my web automation purposes. So, this method uses the appropriate POI method to read the data but then turns it into a String.
Note: getNumericCellValue() on line 13, returns a double even if the number in the spreadsheet is an integer. This is to be able to handle any type of numeric value. But this poses a problem for us as the account ID is being returned as 12345.0. So, in the code, I cast it to an integer to remove the decimal place. Only do this if you’re sure the cell doesn’t really contain decimal places.
13 |
case NUMERIC -> String.valueOf((int)cell.getNumericCellValue()); |
After entering the account ID and transaction ID into the web form using Selenium, we’re taken to this page which shows us the transaction details.
We can use the same approach to verify the data on the UI with what’s in the spreadsheet.
1 2 3 4 5 |
var results = transactionPage.getTransactionResults(); assertEquals(getCellData("Date", row), results.get(0)); assertEquals(getCellData("Description", row), results.get(1)); assertEquals(getCellData("Credit", row), results.get(3)); |
Bonus
I’d like to be able to reuse these POI methods for other tests without having to recreate this logic every time. So, I’m going to store the methods in a utility class.
SpreadsheetUtil.java
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 |
package utils; import org.apache.poi.ss.usermodel.Cell; import org.apache.poi.ss.usermodel.Sheet; import org.apache.poi.ss.usermodel.WorkbookFactory; import java.io.File; import java.util.HashMap; import java.util.Map; public class SpreadsheetUtil { private File spreadsheet; private Sheet currentSheet; private Map<String, Integer> columns; public SpreadsheetUtil(File file){ spreadsheet = file; columns = new HashMap(); } public void switchToSheet(String name){ try(var workbooks = WorkbookFactory.create(spreadsheet)){ currentSheet = workbooks.getSheet(name); currentSheet.getRow(0).forEach(cell ->{ columns.put(cell.getStringCellValue(), cell.getColumnIndex()); }); }catch(Exception e){ e.printStackTrace(); } } public String getCellData(String column, int row){ var dataRow = currentSheet.getRow(row); return getCellDataAsString(dataRow.getCell(columns.get(column))); } private String getCellDataAsString(Cell cell){ return switch(cell.getCellType()){ case STRING -> cell.getStringCellValue(); case NUMERIC -> String.valueOf((int)cell.getNumericCellValue()); default -> ""; }; } } |
Test
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 |
package files; import base.BaseTests; import org.junit.jupiter.api.BeforeAll; import org.junit.jupiter.api.Test; import parabank.pages.FindTransactionsPage; import parabank.pages.Page; import utils.SpreadsheetUtil; import java.io.File; import static org.junit.jupiter.api.Assertions.assertEquals; public class FileTests extends BaseTests { @BeforeAll public static void launchApp(){ driver.get("https://parabank.parasoft.com/parabank/index.htm"); } @Test public void checkTransaction(){ String user = "john", password = "demo"; String fileName = "resources/Test Data.xlsx"; var page = new Page(driver); page.login(user, password); page.clickMenuLink("Find Transactions"); var spreadsheet = new SpreadsheetUtil(new File(fileName)); spreadsheet.switchToSheet(user); int row = 1; String accountId = spreadsheet.getCellData("Account ID", row); String transactionId = spreadsheet.getCellData("Transaction ID", row); var findTransactionPage = new FindTransactionsPage(driver); var transactionPage = findTransactionPage.searchByTransactionId(accountId, transactionId); var results = transactionPage.getTransactionResults(); assertEquals(spreadsheet.getCellData("Date", row), results.get(0)); assertEquals(spreadsheet.getCellData("Description", row), results.get(1)); assertEquals(spreadsheet.getCellData("Credit", row), results.get(3)); } } |
Nevena
Angie Jones
I don’t recommend catching exceptions in page object classes. let the test handle it.
Nevena
Hemant Varhekar
I am getting an error Switch stamens. It’s throwing switch statement is not supported in java version 12 and above. Not sure whether this is an issue with java version 13.02 which I have. But when I switched to old “Switch ” block it works
Angie Jones
You have to select the option to enable previews (or use Java 14)
Daniel