
Test Automation for PDF Files
For years, the automated verification of PDFs was incredibly challenging, if not impossible. Because of this, teams would automate their UI tests but would skip the part where they verify that their PDF artifacts were accurate. This then became the boring, mundane, error-prone task left for the testers to repeat release after release.
Since then, visual validation tools such as Applitools Eyes have hit the scene, making automated regression testing of an application’s look and feel possible. A common question I receive is “Does Applitools work for PDFs?” I knew the answer was yes, but I decided to give it a try myself to see exactly how it works.
There are actually two ways to invoke PDF test automation. The tool’s tutorial page shows how to execute the PDF validation via the command line. However, as an automation engineer, I wondered if it was possible to do this via my existing automation framework. Since the magic happens via a Java CLI command, I was pretty sure it should also work from my code. But I wanted to try it out just to be sure. It worked!
I’ll detail both approaches.
Command Line Interface (CLI)
Applitools provides an executable, ImageTester.jar, which is a tool that verifies stand-alone images and also PDF files.
It’s pretty straightforward to use. You put your PDF files in a directory and run a command from your terminal to have this tool verify all files within that directory. Alternatively, you can specify an individual PDF filename and it will only verify that particular file.
    java -jar ImageTester.jar -k $APPLITOOLS_API_KEY -f <PATH>/pdf_directory/
The -k argument is your API key, which you can obtain by opening a free account. The -f argument is the path to the directory or file that you want verified.
I moved the ImageTester.jar into a directory and also added another directory there called Invoice_PDFs where I stored this PDF file. I then ran the command and voila, the test was executed!
The first time I ran this, a baseline was saved. Every time it ran after that, the PDF was compared against the baseline. If anything changed in the PDFs, we’d get an error message on the console and a link to review the differences in the Applitools dashboard.
    java -jar ImageTester.jar -k $APPLITOOLS_API_KEY -f Invoice_PDFs
    Batch: Invoice_PDFs
    [Mismatch] - INV12345.pdf
    + Result url: https://eyes.applitools.com/app/batches
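The console output above marks each changed file with a [Mismatch] line. Here is a minimal sketch of how a test could pull the affected file names out of that output. It assumes every mismatch line has the same "[Mismatch] - <file>" shape as the sample; the class and method names (ImageTesterOutput, mismatchedFiles) are hypothetical and not part of the tool.

```java
import java.util.ArrayList;
import java.util.List;

/** Collects the file names that the console output flags as mismatched. */
final class ImageTesterOutput {

    // Assumption: mismatch lines look like "[Mismatch] - INV12345.pdf",
    // as in the sample console output shown in the post.
    private static final String MISMATCH_PREFIX = "[Mismatch] - ";

    static List<String> mismatchedFiles(String consoleOutput) {
        List<String> files = new ArrayList<>();
        // \R matches any line break, so Windows and Unix output both work.
        for (String line : consoleOutput.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.startsWith(MISMATCH_PREFIX)) {
                files.add(trimmed.substring(MISMATCH_PREFIX.length()).trim());
            }
        }
        return files;
    }
}
```

Listing the file names in the assertion message tells you which PDFs changed without having to scroll through the raw console output.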
Code
The CLI approach is cool, but I got to thinking about how I would want to use this as an automation engineer. It would be in the midst of an automated scenario where I’ve taken action on the UI, am downloading the resulting PDF, and now want to verify it.
So, I wrote an automated test for Invoice Simple that uses the UI to create a new invoice, then downloads a PDF of that invoice and then uses the ImageTester to verify the PDF.
After writing all the UI code, I needed to add the following as well:
- Code to move the PDF file from my computer’s default download directory to the directory where I store the PDF files that I want verified.

    File downloadedPDF = new File("/Users/angie/Downloads/" + invoiceNumber + ".pdf");
    String destination = "resources/Invoice_PDFs/" + invoiceNumber + ".pdf";
    FileUtils.moveFile(downloadedPDF, destination);

    // Helper called above: replaces any PDF left over from a previous run
    public static boolean moveFile(File file, String destination){
        File existingFile = new File(destination);
        if(existingFile.exists()){
            existingFile.delete();
        }
        return file.renameTo(new File(destination));
    }

- Execute the ImageTester.jar command. I wrote a utility method so that I could reuse it from any test.

    public static boolean validatePDF(String filepath) throws IOException, InterruptedException {
        String command = String.format(
                "java -jar resources/ImageTester.jar -k %s -f %s",
                System.getProperty("applitools.api.key"),
                filepath);

        Process process = Runtime.getRuntime().exec(command);

        // Read the output before waiting, so the process cannot block on a full output buffer
        String stream = IOUtils.toString(process.getInputStream(), "UTF-8");
        process.waitFor();
        System.out.println(stream);

        // ImageTester prints "[Mismatch]" for any PDF that differs from its baseline
        if(stream != null && stream.contains("Mismatch")){
            return false;
        }

        return true;
    }

    Assert.assertTrue("Error validating PDF", validatePDF(destination));

- There is a date on the PDF file indicating when this file was generated. Well, that date will change each day that this test runs. Fortunately, Applitools has a way to ignore certain regions of the PDF file. After my initial test ran and the baseline was captured, I was able to go to the dashboard and specify to ignore the date area.
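The move and run steps above can also be sketched with java.nio and ProcessBuilder, which is one way to cope with download paths that contain spaces or sit on a different drive. This is only a sketch under assumptions, not the code from the post. The names PdfValidation, moveReplacing, imageTesterCommand, runAndCapture and validatePdf are hypothetical, and the only ImageTester flags used are the -k and -f arguments described earlier.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;
import java.util.List;

final class PdfValidation {

    /** Moves the downloaded PDF, overwriting any copy left by an earlier run. */
    static Path moveReplacing(Path source, Path destination) throws IOException {
        Path parent = destination.getParent();
        if (parent != null) {
            Files.createDirectories(parent); // no-op if the directory already exists
        }
        return Files.move(source, destination, StandardCopyOption.REPLACE_EXISTING);
    }

    /** Builds the command as a list, so each entry reaches the JVM as one argument. */
    static List<String> imageTesterCommand(String jarPath, String apiKey, String target) {
        return Arrays.asList("java", "-jar", jarPath, "-k", apiKey, "-f", target);
    }

    /** Runs a command and returns everything it printed (stdout and stderr merged). */
    static String runAndCapture(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .redirectErrorStream(true)
                .start();
        // Drain the output before waiting, so the process cannot block on a full pipe.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (InputStream in = process.getInputStream()) {
            byte[] chunk = new byte[4096];
            int read;
            while ((read = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
        }
        process.waitFor();
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Returns false when the output contains a mismatch marker, as in the post. */
    static boolean validatePdf(String jarPath, String apiKey, String target)
            throws IOException, InterruptedException {
        String output = runAndCapture(imageTesterCommand(jarPath, apiKey, target));
        System.out.println(output);
        return !output.contains("Mismatch");
    }
}
```

A test could call PdfValidation.moveReplacing(downloaded, destination) and then assert on PdfValidation.validatePdf("resources/ImageTester.jar", System.getProperty("applitools.api.key"), destination.toString()). Passing the arguments as a list avoids the whitespace splitting that a single command string is subject to.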
I really like how flexible this tool is. There’s a host of other arguments you can use as well.
This all worked like a charm and was much simpler than I anticipated. You can find all of my code for this automated test for PDF files on my GitHub.
Jim Hazen
You’re right Angie, a PDF file has been one of the toughest things to “test” with automation (and even by human means). There have been tools (with a CLI) in the past that will do comparison of two files and give a Pass/Fail error messages. But they are prone to false negative/positives as you point out due to date/timestamps and other things that are “sensitive data” that cause flakiness. Nice to see that Applitools Eyes can do a “reverse mask” of the region to remove it from the comparison. An old technique that has found new life once again. Also there is the risk of potential differences at the pixel level. Fortunately with some fuzzing logic that can be tuned to allow for useful partial matches within tolerance ranges. As we both know using Image comparison can be tricky, but with today’s technology (in comparison to 25 years ago) there can be uses for it that have benefit.
Angie Jones
yeah, Applitools doesn’t use pixel to pixel comparison fortunately, so those false negatives aren’t a prevalent issue. Image comparison has certainly come a long way. 🙂
Puneet Bisht
Thanks Angie for the post. For sure, PDF automation was the most critical/challenging/ROI automation for our organization. In addition to verifying the PDF content, there were a few other areas in the PDF automation ecosystem that need best practices to make the tool efficient (if using open source frameworks):
1) How to generate the new PDF efficiently (we use Karate; previously we used our own in-house Java API framework)
2) https://github.com/red6/pdfcompare for PDF comparison (free and serves our need)
3) Reporting: we use the Cucumber report and attach the PDF diff along with the credentials to replicate the issue
4) Managing the baselines
Angie Jones
I just read another post today about using Rest-Assured for downloading files. Pretty cool
Maik Toepfer
Thanks for pointing me to pdfcompare. It follows a rather traditional, non-cloud-based approach, but it looks like a very good 80% solution.
Andy
Gave this a whirl, as why not…looks like a great little tool. Sadly the corporate firewall didn’t like the direct connection of the CLI. I think I will need to give it another go from a scripted solution so that I can enforce a connection through a proxy.
Musaffir
Thanks for the post, Angie. I work for a company that makes payroll software, so we produce a lot of PDFs from the app, like payslips, year-end reports, etc. So far we haven’t considered PDF automation, mainly due to all the challenges you have described. I explored Applitools Eyes and I found real value in it. We just have to make a budget and purchase an Applitools Eyes licence to get all these benefits 🙂
Hiromi
This is a fantastic tool. Our company has an OCR and image processing product, and I was looking for a way to automate the validation. Thank you, Angie, for the post!
Sahil Thareja
Nicely presented! I have one question: we need a paid Applitools membership in order to perform this validation, am I right?
Angie Jones
There’s a free account as well
Ramkumar Gour
Hi Angie, thanks for this blog. This tool looks promising. One question: do we always need to set the baseline, which means we need to trigger the java command with args two times? Also, you said that the first time it also does some validations. What does it test when it doesn’t have anything to compare to?