Writing great backend unit tests

Unit tests are your first line of defence against bugs and regressions that could cost your business money, customers and reputation.

I've consolidated some best practices I've picked up into this article to help more junior engineers ramp up on unit testing more quickly. It is split into two parts:

How to think about unit testing
How to actually write great unit tests

All example code is written in Java, but the principles it illustrates should apply to most languages.

How to think about unit testing

1. Aim for state coverage, not code coverage

What is state coverage?

Code coverage measures how much of your code is executed by your tests. For example, 100% line coverage means your unit tests execute every line of code.

State coverage refers to how much your tests validate different states of your system. Different states may be brought about by different inputs or conditions.

Let's take the following example of code that divides two numbers.

public double divide(double numerator, double denominator) throws IllegalArgumentException {
    if (denominator == 0) {
        throw new IllegalArgumentException("Cannot divide by zero");
    }
    return numerator / denominator;
}

We could write two unit tests to reach 100% code coverage:

Testing the unhappy case where denominator == 0
Testing a happy case where we successfully perform division (e.g. 4/2)

Although we've tested all lines of code, we've missed checking how the system reacts to the following inputs:

Either the numerator or the denominator is a negative number
Both the numerator and denominator are negative numbers
The numerator is 0

This example is trivial, but you can see how easy it is to miss bugs in complex programs if you don't think about all the possible states they could enter.
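As a rough sketch of what covering those extra states might look like, the checks below exercise the divide method from above with negative and zero inputs. The `Divider` wrapper class, the static modifier and the `check` helper are assumptions made for illustration. Plain Java checks stand in for JUnit assertions so that the sketch runs without dependencies.

```java
// Hypothetical wrapper class for the divide method shown above.
class Divider {
    static double divide(double numerator, double denominator) {
        if (denominator == 0) {
            throw new IllegalArgumentException("Cannot divide by zero");
        }
        return numerator / denominator;
    }
}

class DivideStateCoverageSketch {
    // Minimal stand-in for a JUnit assertion.
    static void check(boolean condition, String description) {
        if (!condition) {
            throw new AssertionError("Failed: " + description);
        }
    }

    // Returns true if dividing by zero is rejected with an IllegalArgumentException.
    static boolean throwsOnZeroDenominator() {
        try {
            Divider.divide(4, 0);
            return false;
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        // The two tests that already reach 100% line coverage.
        check(Divider.divide(4, 2) == 2.0, "positive / positive");
        check(throwsOnZeroDenominator(), "zero denominator is rejected");

        // Extra states that line coverage alone would not prompt us to test.
        check(Divider.divide(-4, 2) == -2.0, "negative numerator");
        check(Divider.divide(4, -2) == -2.0, "negative denominator");
        check(Divider.divide(-4, -2) == 2.0, "both negative");
        check(Divider.divide(0, 2) == 0.0, "zero numerator");
    }
}
```

Line coverage is identical before and after the four extra checks. Only the number of validated states changes.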

Caveats

It's not always reasonable to test every possible state, so you will have to use your own judgement. At a minimum, test the most critical happy and unhappy paths.

Some reasons for this include:

It may not be feasible to test every state (e.g. if there are several inputs it's unlikely we can test every combination of them)
It may not be valuable to test minor edge cases that have minimal impact on the system
It may not be useful to test code that is extremely trivial (e.g. a method that has no logic and just returns a value)

We shouldn't completely dismiss code coverage either. I still believe it's a useful indicator to check if we've glaringly missed testing any lines of code. However, it shouldn't be the sole metric used to gauge quality of testing.

What are common ways state is missed?

To help you get familiar with this concept, the following are some common examples of how state coverage gets missed.

Not testing both null and non-null values when an input is nullable
Not testing variations of a collection input (e.g. a list with valid elements, a list with invalid elements, a list with a mixture of elements, an empty list)
Not testing invalid inputs (e.g. missing inputs, duplicated inputs)
Not testing boundary cases and edge cases
Not testing all possible error/exception cases
Not testing important branches in conditional logic (e.g. if-statements, switch cases)
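To illustrate a few of these, here is a minimal sketch that walks one method through null, empty, valid, invalid, mixed and boundary inputs. The `AmountValidator.countValid` method and its rules (null and negative amounts are invalid, a null list is treated as empty) are hypothetical and exist only for this example. Plain Java checks stand in for JUnit assertions.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class AmountValidator {
    // Hypothetical method under test: counts the amounts that are non-null and not negative.
    // A null list is treated like an empty list.
    static int countValid(List<Integer> amounts) {
        if (amounts == null) {
            return 0;
        }
        int valid = 0;
        for (Integer amount : amounts) {
            if (amount != null && amount >= 0) {
                valid++;
            }
        }
        return valid;
    }
}

class MissedStateSketch {
    // Minimal stand-in for a JUnit assertion.
    static void check(boolean condition, String description) {
        if (!condition) {
            throw new AssertionError("Failed: " + description);
        }
    }

    public static void main(String[] args) {
        // Nullable input: both the null and the non-null case.
        check(AmountValidator.countValid(null) == 0, "null list");
        check(AmountValidator.countValid(Collections.<Integer>emptyList()) == 0, "empty list");

        // Variations of a collection input.
        check(AmountValidator.countValid(Arrays.asList(1, 2, 3)) == 3, "only valid elements");
        check(AmountValidator.countValid(Arrays.asList(-1, null)) == 0, "only invalid elements");
        check(AmountValidator.countValid(Arrays.asList(5, -1, null, 10)) == 2, "mixture of elements");

        // Boundary case: zero sits exactly on the valid/invalid boundary.
        check(AmountValidator.countValid(Arrays.asList(-1, 0, 1)) == 2, "zero counts as valid");
    }
}
```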

2. Test state transitions

It's also worth thinking about testing state transitions involving a sequence of operations dependent on each other. 'Operations' in this context usually refers to different functions/methods which can be called.

Typically, these sequences:

Change the system's state with each operation, and
Can result in different outcomes depending on the order of operations

Let's clarify this way of thinking by looking at a few example scenarios.

Example 1. Writing a class which has separate methods to open a resource, do processing and close a resource

A class like this may be doing something like opening a file to do some processing and then closing the file.

What happens if you try to process or close before opening a resource?
What happens if you try to re-open, process or close a resource that has already been opened and closed?
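The questions above could be turned into checks along these lines. The `ResourceProcessor` class is hypothetical, and the choice to reject out-of-order calls with an IllegalStateException is an assumption. Your own class may legitimately behave differently (for example, by allowing re-opening), in which case the tests should pin down that behaviour instead.

```java
// Hypothetical class with an explicit lifecycle: NEW -> OPEN -> CLOSED.
class ResourceProcessor {
    private enum State { NEW, OPEN, CLOSED }

    private State state = State.NEW;

    void open() {
        if (state != State.NEW) {
            throw new IllegalStateException("Cannot open from state " + state);
        }
        state = State.OPEN;
    }

    void process() {
        if (state != State.OPEN) {
            throw new IllegalStateException("Cannot process in state " + state);
        }
        // ... real processing would happen here
    }

    void close() {
        if (state != State.OPEN) {
            throw new IllegalStateException("Cannot close from state " + state);
        }
        state = State.CLOSED;
    }
}

class ResourceTransitionSketch {
    // Returns true if the action is rejected with an IllegalStateException.
    static boolean throwsIllegalState(Runnable action) {
        try {
            action.run();
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    static void check(boolean condition, String description) {
        if (!condition) {
            throw new AssertionError("Failed: " + description);
        }
    }

    public static void main(String[] args) {
        // Processing or closing before opening.
        check(throwsIllegalState(new ResourceProcessor()::process), "process before open");
        check(throwsIllegalState(new ResourceProcessor()::close), "close before open");

        // The expected order of operations completes without error.
        ResourceProcessor processor = new ResourceProcessor();
        processor.open();
        processor.process();
        processor.close();

        // Reusing the same resource after it has been closed.
        check(throwsIllegalState(processor::open), "re-open after close");
        check(throwsIllegalState(processor::process), "process after close");
        check(throwsIllegalState(processor::close), "close twice");
    }
}
```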

Example 2. Writing a financial program that deposits and withdraws from an account

What happens if you withdraw money before depositing any first?
What happens if you deposit money, and then try to withdraw less than, exactly, or more than the amount you deposited?
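A minimal sketch of these sequences follows. The `BankAccount` class is hypothetical, as is the decision that a rejected withdrawal returns false and leaves the balance untouched. Amounts are whole cents to avoid floating-point rounding.

```java
// Hypothetical account whose behaviour depends on the order of deposits and withdrawals.
class BankAccount {
    private long balanceInCents = 0;

    void deposit(long amountInCents) {
        if (amountInCents <= 0) {
            throw new IllegalArgumentException("Deposit must be positive");
        }
        balanceInCents += amountInCents;
    }

    // Returns true if the withdrawal succeeded, false if funds were insufficient.
    boolean withdraw(long amountInCents) {
        if (amountInCents > balanceInCents) {
            return false;
        }
        balanceInCents -= amountInCents;
        return true;
    }

    long balance() {
        return balanceInCents;
    }
}

class AccountTransitionSketch {
    static void check(boolean condition, String description) {
        if (!condition) {
            throw new AssertionError("Failed: " + description);
        }
    }

    // Builds the shared starting state for the "deposit first" scenarios.
    static BankAccount accountWithDeposit(long amountInCents) {
        BankAccount account = new BankAccount();
        account.deposit(amountInCents);
        return account;
    }

    public static void main(String[] args) {
        // Withdraw before any deposit.
        BankAccount empty = new BankAccount();
        check(!empty.withdraw(1), "withdrawal from empty account is rejected");
        check(empty.balance() == 0, "balance is unchanged after rejection");

        // Withdraw less than the deposit.
        BankAccount less = accountWithDeposit(100);
        check(less.withdraw(40) && less.balance() == 60, "partial withdrawal");

        // Withdraw exactly the deposit.
        BankAccount equal = accountWithDeposit(100);
        check(equal.withdraw(100) && equal.balance() == 0, "full withdrawal");

        // Withdraw more than the deposit.
        BankAccount more = accountWithDeposit(100);
        check(!more.withdraw(101) && more.balance() == 100, "overdraw is rejected");
    }
}
```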

3. Add tests every time you fix a bug to prevent regressions

This gives us two assurances:

The bug is demonstrably fixed, because the test fails without the fix and passes with it
The bug won't show up again in the future because we've added a test to catch it

Similarly, tests give us confidence that future changes won't break existing functionality.

4. Write tests to document behaviour

Tests aren't just useful for verifying that code works how you expect it to at the time of writing it. They're also a great way to document the expected behaviour of code.

This is especially helpful in a shared codebase where you're trying to understand unfamiliar code that you didn't write, or are trying to recall the behaviour of code you wrote a long time ago.

How to actually write great unit tests

1. Structure individual test files logically

A massive, unreadable test file makes it more difficult to:

Uncover existing tests when they need to be modified
Uncover reusable code or helper functions
Grasp a high-level understanding of what has or hasn't been tested

These are some things you could do:

Parameterize your tests where applicable. Each unit test should cover one specific scenario. Parameterized tests allow you to execute the same unit test with different parameters (e.g. different inputs and expected outputs) to cover different states, meaning you don't have to keep copying and pasting the same setup code for similar scenarios. With JUnit 5, you would achieve this with the @ParameterizedTest annotation (JUnit 4 uses the Parameterized runner instead).
Group related tests together. It's easier to scan the file when the tests are grouped by functionality. You could either put them close together in the file, or use something like JUnit 5's @Nested annotation to group them explicitly.
Structure the sections of your file consistently. For example, always put helper functions at the bottom of the test file.
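The idea behind parameterization can be sketched without any framework: one test body, driven by rows of inputs and expected outputs. The class and method names below are made up for illustration. In a real JUnit 5 suite you would more likely put @ParameterizedTest on the test method and supply the rows with a source annotation such as @CsvSource, which also reports each row as its own test.

```java
class ParameterizedDivideSketch {
    // Method under test, repeated here so the sketch is self-contained.
    static double divide(double numerator, double denominator) {
        if (denominator == 0) {
            throw new IllegalArgumentException("Cannot divide by zero");
        }
        return numerator / denominator;
    }

    // Each row is one scenario: {numerator, denominator, expected result}.
    static final double[][] CASES = {
        {4, 2, 2},
        {-4, 2, -2},
        {4, -2, -2},
        {-4, -2, 2},
        {0, 2, 0},
    };

    // The single test body that every row is run through.
    static void divideReturnsExpectedQuotient(double numerator, double denominator, double expected) {
        double actual = divide(numerator, denominator);
        if (actual != expected) {
            throw new AssertionError(
                numerator + " / " + denominator + ": expected " + expected + " but got " + actual);
        }
    }

    public static void main(String[] args) {
        for (double[] row : CASES) {
            divideReturnsExpectedQuotient(row[0], row[1], row[2]);
        }
    }
}
```

Adding a new scenario is then a one-line change to the table rather than another copied test method.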

2. Structure unit tests with given-when-then

This is a convention that I like to use both when naming and structuring tests.

This is what each part means:

Given represents preconditions
When represents actions taken
Then represents expected outcomes

Let's look at the following example.

@Test
public void createAccount() {
  // ... unstructured test code
}

@Test
public void testDepositOnClosedAccount() {
  // ... unstructured test code
}

@Test
public void withdrawalWorks() {
  // ... unstructured test code
}

@Test
public void withdrawExceedsBalance() {
  // ... unstructured test code
}

Let's convert these tests to use the given-when-then convention.

@Test
public void whenCreateAccountThenSucceed() {
  // when, then
  // ... code to create an account and assert expected outcome
}

@Test
public void givenClosedAccountWhenDepositThenReturnError() {
  // given
  // ... code to set up initial state (create a closed account, set up any mocks)

  // when
  // ... code to perform action (e.g. deposit)

  // then
  // ... code to assert expected outcome
}

@Test
public void givenSufficientBalanceWhenWithdrawThenReturnSuccess() {
  // given
  // ...

  // when
  // ...

  // then
  // ...
}

@Test
public void givenInsufficientBalanceWhenWithdrawThenReturnError() {
  // given
  // ...

  // when
  // ...

  // then
  // ...
}

Each test becomes easier to scan because it describes a linear journey (i.e. precondition -> action -> outcome).

Of course, this is just a convention and is up to personal preference. Some may find it too prescriptive or verbose.

However, I've found that using this structure forces you to be consistently deliberate and explicit with the naming and structure of your tests, which then makes it more difficult to write unclear tests.
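To make the structure concrete, here is one way the closed-account test could look once filled in. The `Account` class, its `Result` enum and the rule that a closed account rejects deposits are all assumptions made for this sketch. The @Test annotation is omitted and a plain `check` helper replaces JUnit assertions so that the example runs on its own.

```java
// Hypothetical account that can be closed and reports outcomes as a Result.
class Account {
    enum Result { SUCCESS, ERROR }

    private boolean closed = false;
    private long balanceInCents = 0;

    void close() {
        closed = true;
    }

    Result deposit(long amountInCents) {
        if (closed || amountInCents <= 0) {
            return Result.ERROR;
        }
        balanceInCents += amountInCents;
        return Result.SUCCESS;
    }

    long balance() {
        return balanceInCents;
    }
}

class GivenWhenThenSketch {
    static void check(boolean condition, String description) {
        if (!condition) {
            throw new AssertionError("Failed: " + description);
        }
    }

    static void givenClosedAccountWhenDepositThenReturnError() {
        // given
        Account account = new Account();
        account.close();

        // when
        Account.Result result = account.deposit(5000);

        // then
        check(result == Account.Result.ERROR, "deposit on a closed account returns an error");
        check(account.balance() == 0, "balance is unchanged");
    }

    public static void main(String[] args) {
        givenClosedAccountWhenDepositThenReturnError();
    }
}
```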

3. Reduce code duplication with test fixtures and shared helpers

If there is duplicated code in multiple unit tests within the same file, consider extracting it into a common helper function.

If there is duplicated code across multiple test files, consider extracting it into a common helper file or test fixture (i.e. a file used explicitly for instantiating common test objects).
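A test fixture could be as small as the sketch below. The `Customer` class, its `canPlaceOrder` rule and the `CustomerFixtures` factory methods are hypothetical. The point is only that tests ask the fixture for a named kind of object instead of repeating constructor calls.

```java
// Hypothetical domain object with enough fields to make construction noisy.
class Customer {
    final String name;
    final String email;
    final boolean active;

    Customer(String name, String email, boolean active) {
        this.name = name;
        this.email = email;
        this.active = active;
    }

    boolean canPlaceOrder() {
        return active && email != null && !email.isEmpty();
    }
}

// Shared fixture: the one place that knows how to build common test objects.
class CustomerFixtures {
    static Customer activeCustomer() {
        return new Customer("Ada", "ada@example.com", true);
    }

    static Customer inactiveCustomer() {
        return new Customer("Bob", "bob@example.com", false);
    }

    static Customer customerWithoutEmail() {
        return new Customer("Cy", null, true);
    }
}

class CustomerOrderSketch {
    static void check(boolean condition, String description) {
        if (!condition) {
            throw new AssertionError("Failed: " + description);
        }
    }

    public static void main(String[] args) {
        // Each check reads as a scenario because the setup lives in the fixture.
        check(CustomerFixtures.activeCustomer().canPlaceOrder(), "active customer can order");
        check(!CustomerFixtures.inactiveCustomer().canPlaceOrder(), "inactive customer cannot order");
        check(!CustomerFixtures.customerWithoutEmail().canPlaceOrder(), "customer without email cannot order");
    }
}
```

If the `Customer` constructor later gains a field, only the fixture needs to change, not every test that builds a customer.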