Approval Tests For PDF Document Generation

October 7, 2021

A while back I was confronted with a part of a legacy system that generates PDF documents. This legacy system used a well-known library for generating the requested PDF files. The good news was that there was a decent number of tests available. The not-so-good news was that these tests made heavy use of test doubles for swapping out most of the types provided by the third-party API. I strongly believe that using test doubles in such cases is not a good choice. In the past I’ve already written about why to avoid using test doubles for types that you don’t own.

Tests like these can become a large impediment whenever we upgrade to a newer version of the third-party library. Major versions quite often introduce breaking changes to existing APIs. Whenever tests are strongly coupled to such an API, they basically need to be rewritten during the upgrade.

In order to reduce the coupling of these tests, I decided to replace them with Approval Tests instead. This technique is also known as “Golden Master” or “Characterization Test”. The idea behind an Approval Test is that after the first test run, some output needs to be visually verified and approved. During subsequent test runs, the approved output is compared to the current output. When there’s a difference between the two, the test fails.
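To make the mechanism concrete, here is a minimal sketch of the approve-then-compare cycle using only the Java standard library. The GoldenMaster class, its verify method and the .approved.txt / .received.txt file names are hypothetical names chosen for illustration; they are not the API of any Approval Test library.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper for illustration; not part of any Approval Test library.
final class GoldenMaster {

    /**
     * Returns true when the output matches the approved file byte for byte.
     * Otherwise the output is written to a "received" file and false is returned,
     * which is also what happens on the very first run when nothing is approved yet.
     */
    static boolean verify(Path directory, String testName, byte[] output) throws IOException {
        Path approved = directory.resolve(testName + ".approved.txt");
        Path received = directory.resolve(testName + ".received.txt");

        if (Files.exists(approved) && Arrays.equals(Files.readAllBytes(approved), output)) {
            // Remove a leftover received file from an earlier failing run.
            Files.deleteIfExists(received);
            return true;
        }

        // Keep the current output around so that a human can inspect and approve it.
        Files.write(received, output);
        return false;
    }
}
```

Approving then simply means renaming the received file to the approved file, after which the same call returns true until the output changes again.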

This is especially useful whenever we have to deal with the code of a legacy system. You can check out this video from Emily Bache where she demonstrates the Gilded Rose refactoring kata using Approval Tests. Highly recommended!

Usually the output of Approval Tests is captured in plain text files, which doesn’t work for binary PDF files. So I decided to extend the Java version of an Approval Test library to support PDF documents as well. Let’s have a look at some example code to demonstrate this extension.

Suppose that we have a small application that generates a PDF document. A generated document contains the refrain of a well-known song lyric. The following code shows a possible implementation.

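// Assumption: the PDF types below come from iText 7; the article does not name the library.
import com.itextpdf.kernel.pdf.PdfDocument;
import com.itextpdf.kernel.pdf.PdfWriter;
import com.itextpdf.layout.Document;
import com.itextpdf.layout.element.List;
import com.itextpdf.layout.element.ListItem;
import com.itextpdf.layout.element.Paragraph;

import java.io.ByteArrayOutputStream;

public class SingAlongPdfGenerator {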

    public ByteArrayOutputStream generate(SingAlongData pdfData) {

        var outputStream = new ByteArrayOutputStream();
        PdfDocument pdf = new PdfDocument(new PdfWriter(outputStream));

        Document document = new Document(pdf);
        document.add(new Paragraph("Let's sing-a-long:"));

        List list = new List()
            .setSymbolIndent(12)
            .setListSymbol("\u2022");

        for (var refrainLine : pdfData.getRefrainLines()) {
            list.add(new ListItem(refrainLine));
        }

        document.add(list);
        document.close();

        return outputStream;
    }
}

import java.util.List;

public class SingAlongData {

    private final List<String> refrainLines;

    public SingAlongData(List<String> refrainLines) {

        this.refrainLines = refrainLines;
    }

    public List<String> getRefrainLines() {
        return refrainLines;
    }
}

The SingAlongPdfGenerator class provides a method named generate that accepts a SingAlongData instance as its only parameter. The SingAlongData class is merely a DTO that provides a list of refrain lines. The generate method creates a new document containing a paragraph of text and a list of the refrain lines.
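Because generate returns the document as an in-memory ByteArrayOutputStream, a caller still has to persist those bytes somewhere. The following is a small sketch of one way to do that with the standard library; the PdfFileWriter class and its writeTo method are hypothetical names that do not appear in the original code.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration only.
final class PdfFileWriter {

    /** Writes the in-memory bytes to the target file, replacing any existing content. */
    static void writeTo(Path target, ByteArrayOutputStream pdfBytes) throws IOException {
        // Files.newOutputStream creates the file or truncates an existing one by default.
        try (OutputStream out = Files.newOutputStream(target)) {
            pdfBytes.writeTo(out);
        }
    }
}
```

Keeping the generator free of file handling is what makes it easy to hand the same stream to a test instead.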

Let’s have a look at the code of the corresponding Approval Test.

public class SingAlongPdfGeneratorTests {

    @Test
    public void generateRickRollPdf() {

        // Arrange: the test data and the Subject Under Test
        var refrainLines = Arrays.asList(
            "Never gonna give you up",
            "Never gonna let you down",
            "Never gonna run around and desert you",
            "Never gonna make you cry",
            "Never gonna say goodbye",
            "Never gonna tell a lie and hurt you"
        );
        var data = new SingAlongData(refrainLines);
        var pdfGenerator = new SingAlongPdfGenerator();

        // Act: generate the PDF document in memory
        var result = pdfGenerator.generate(data);

        // Assert: compare the result with the approved PDF file
        PdfApprovals.verify(result);
    }
}

First we create an instance of the SingAlongData DTO, specifying some test data. Next we create an instance of the Subject Under Test, which in this case is the SingAlongPdfGenerator class, and call its generate method. This returns a ByteArrayOutputStream containing the data of the PDF document. Then we verify the result by calling the PdfApprovals.verify method. Notice that the anatomy of an Approval Test is identical to any other type of test, as it also adheres to the Arrange, Act, Assert pattern.

A new Approval Test always fails the very first time that it gets executed. After the initial test run, a PDF file is generated that needs to be visually verified and approved. So when we first run the generateRickRollPdf test, a file with the name SingAlongPdfGeneratorTests.generateRickRollPdf.received.pdf appears which has the following content:

[Figure: A generated PDF file that needs to be approved]

After we’ve verified the PDF document, we approve it using the following command:

mv ~/src/test/resources/pdfapproval/SingAlongPdfGeneratorTests.generateRickRollPdf.received.pdf \
   ~/src/test/resources/pdfapproval/SingAlongPdfGeneratorTests.generateRickRollPdf.approved.pdf

At this point we have a PDF file named SingAlongPdfGeneratorTests.generateRickRollPdf.approved.pdf. When we now execute our test again, it passes as the newly generated PDF document matches the approved PDF document.
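The approval step is nothing more than a rename, so it can also be scripted. The sketch below does the same thing in Java, assuming the *.received.pdf / *.approved.pdf naming shown above; the ApproveReceived class and its approve method are hypothetical and not part of the extension.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper for illustration; the article itself approves files with mv.
final class ApproveReceived {

    private static final String RECEIVED_SUFFIX = ".received.pdf";
    private static final String APPROVED_SUFFIX = ".approved.pdf";

    /** Renames "<name>.received.pdf" to "<name>.approved.pdf", replacing an earlier approval. */
    static Path approve(Path receivedFile) throws IOException {
        String fileName = receivedFile.getFileName().toString();
        if (!fileName.endsWith(RECEIVED_SUFFIX)) {
            throw new IllegalArgumentException("Not a received file: " + fileName);
        }

        // Swap only the trailing suffix so the rest of the name stays untouched.
        String baseName = fileName.substring(0, fileName.length() - RECEIVED_SUFFIX.length());
        Path approvedFile = receivedFile.resolveSibling(baseName + APPROVED_SUFFIX);

        return Files.move(receivedFile, approvedFile, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Replacing an existing approved file is deliberate here, because approving a change means the old approval is no longer valid.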

Note that we also have to make sure to commit the approved PDF file alongside our test code. Otherwise, we have to approve the output of the test again when executed on another machine.

Let’s say that we want to make a change to the implementation of our SingAlongPdfGenerator class. For example, we’re going to make a small change to the text inside the paragraph.

document.add(new Paragraph("Let's sing-a-long shall we?"));

When we execute our test again, it fails due to the change that we’ve made.

Failed Approval
  Approved:~/src/test/resources/pdfapproval/SingAlongPdfGeneratorTests.generateRickRollPdf.approved.pdf
  Received:~/src/test/resources/pdfapproval/SingAlongPdfGeneratorTests.generateRickRollPdf.received.pdf

The test created a “received” PDF file again, alongside the existing PDF file that we approved earlier. We now have to visually compare the received and the approved file side by side. Needless to say, this is going to be quite cumbersome. For this reason I’ve added the capability that, whenever a test fails, a third PDF file is generated that contains the annotated differences between the two PDF files.

[Figure: A PDF file that indicates the differences]

The purple markers indicate where the changes are located in the document. The green colon indicates what is expected (approved). The red text that we’ve added to the paragraph shows the actual text found in the document (received). We now have to decide whether we want to approve the changes that we’ve made or not. Let’s say that we are happy with the change that we’ve made. To do that we simply run the approval command again as shown earlier. And that’s it.
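The annotated PDF is produced by the extension itself and works on the rendered documents. Purely to illustrate the expected-versus-actual idea, here is a much simpler text-level sketch. It assumes that the text lines of both documents have already been extracted by some other means, and the TextDiff class is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical text-level comparison; the extension itself annotates the PDF visually.
final class TextDiff {

    /** Describes every line position where the approved and the received text differ. */
    static List<String> differences(List<String> approved, List<String> received) {
        List<String> result = new ArrayList<>();
        int lineCount = Math.max(approved.size(), received.size());

        for (int i = 0; i < lineCount; i++) {
            // A document that ran out of lines is reported as <missing> at that position.
            String expected = i < approved.size() ? approved.get(i) : "<missing>";
            String actual = i < received.size() ? received.get(i) : "<missing>";

            if (!expected.equals(actual)) {
                result.add("line " + (i + 1) + ": expected \"" + expected
                        + "\" but was \"" + actual + "\"");
            }
        }
        return result;
    }
}
```

For the change made above, comparing the two paragraph texts would report a single difference on line 1, with the approved text as expected and the new text as actual.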

With just a handful of these Approval Tests I was able to eliminate all the tightly coupled tests. Generating PDF files is more often than not an infrastructure concern. Sociable tests are much more appropriate in this case than solitary tests. Using Approval Tests this way turned out to be a very valuable approach.

If you and your team want to learn more about how to write maintainable unit tests and get the most out of TDD practices, make sure to have a look at our trainings and workshops or check out the books section. Feel free to reach out at info@principal-it.be.

Jan Van Ryswyck

Thank you for visiting my blog. I’ve been a professional software developer since Y2K and a blogger since Y2K+5. Provider of training and coaching in XP practices. Curator of the Awesome Talks list. Past organizer of the European Virtual ALT.NET meetings. Thinking and learning about all kinds of technologies since forever.
