What Kinds of Tests Should I be Writing?

TL;DR If you build web applications, prefer writing integration tests first because they provide the best return for your team’s time. Integration tests make it easier to change your software, onboard new developers, and empower junior folks to start contributing early. Isolated unit tests become useful when integration tests offer poor experimental control over the behavior of your code.

Why write tests at all?

Before you can decide what types of tests you should be writing, it’s important to talk about the point of writing tests in the first place. For one thing, a good test suite will tell you when you’ve broken some existing behavior in the system. Preserving existing behavior in a growing code base is a difficult task. There are a couple of ways[1] to approach the endeavor:

Edit and Pray

You carefully plan the changes you would like to make, attempt to fully understand the code you’re going to change, make the change, and manually test the changes to make sure nothing is broken.

Cover and Modify

Cover the existing code with automated tests that verify the system behaves in the way you expect it to. With this safety net in place, you can start making changes without fully understanding the implementation. You now have a tight feedback loop to work within and you can feel comfortable knowing that the tests are there to back you up when you’ve done something wrong.

Cover and Modify is preferable. Writing automated tests to verify that your code behaves the way you intend it to empowers your team to change the software. Teams that rely solely on manual testing won’t have the confidence to make changes and progress will grind to a halt.

So it’s clear automated tests are valuable, but we still don’t know what kind of tests to write and why.

There are two types of tests in this world

Testing is a huge topic. The more you read about testing “best practices”, the more confusing it becomes. Should I be writing end-to-end tests? Regression tests? Acceptance tests? Integration tests? Unit tests? To cut through some of the noise, I’m going to act as if there are really only two broad categories of tests: unit tests and integration tests. Your understanding of both of those phrases might be different than mine, so let’s define some terms:

Unit tests exercise a bit of code in isolation from all other components in the system. For example, let’s say you’ve written this function in Go:

func SearchCollection(collection []string, searchTerm string) string {
  for _, item := range collection {
    if strings.Contains(item, searchTerm) {
      return item // first item containing the search term
    }
  }
  return "" // no match found
}

It’s a pure function, meaning that it takes an input and produces an output without side-effects like making network calls or talking to a database. It’s not too painful to write an isolated unit test for the function:

func TestSearchCollection(t *testing.T) {
  collection := []string{"Michael Feathers", "Justin Searls", "Adam Wathan"}
  searchTerm := "Feather"

  want := "Michael Feathers"
  got := SearchCollection(collection, searchTerm)

  if got != want {
    t.Errorf("got %s, want %s", got, want)
  }
}

Unit tests run quickly since they don’t have any side-effects. Their speed enables a tight feedback loop. Make a change, run the tests, repeat.

They also provide error localization. As tests get further away from the code they exercise, it’s harder to figure out where things went wrong if the tests fail. You’re stuck tracing values from the edge of your system to the core, trying to figure out at which point the code did the wrong thing. Error localization helps you by surfacing errors closer to the problem.

The common thread here is that programmer time is expensive and we should be taking steps toward reducing the time it takes for programmers to get useful work done. Tests should be fast and errors should be easier to fix so we can spend more time on the problems we’re trying to solve.

This is all great, but we’re still ignoring some real costs if we default to writing only isolated unit tests:

  • They don’t ensure that the whole system works together. They only confirm that individual parts work in isolation.
  • In my experience, they actually require more programmer time to implement, because isolated unit tests force you to think about and design the right abstractions up front. Designing abstractions that early is often a waste of time because your understanding of the problem domain will inevitably change. Abstractions should be allowed to emerge naturally, but I think that’s a topic for another post entirely.

Integration tests tie together multiple independent pieces of the system in order to verify behavior at a higher level of abstraction. They allow you, the developer, to exercise large chunks of code in order to check whether or not changes to the code in one place affect the behavior of the system in another. Here’s a Rails controller that’s meant to search for people in the database:

class SearchesController < ApplicationController
  def index
    # Find the first person whose name contains the search term (case-insensitive).
    @person = Person.where("name ILIKE ?", "%#{params[:name]}%").first
  end
end

And the test:

class SearchesControllerTest < ActionDispatch::IntegrationTest
  test "performing a search" do
    # some data setup, maybe assigning variables based on fixtures
    get searches_path, params: { name: "Michael" }

    assert_select 'ul.search-results' do
      assert_select 'li.search-result', "Michael Feathers"
    end
  end
end

The integration test doesn’t look too structurally different from the unit test, but it gives us a whole lot of power. Once the test passes, we know that all of the code that executes in the entire request/response cycle can be refactored, re-designed, thrown out, replaced, whatever. By writing tests at the integration level, we can be more confident that the entire stack of pieces works together to search for the person we’re looking for and display their name on the page. With just one test, our code coverage is as high as it would be if we tested each component individually, and we have a better idea of how the application will work for real users.
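
For example, we could later push the query down into the model without touching the test at all. Here’s a sketch of one possible refactor (the Person.search method is hypothetical, not something that exists in the application yet):

class Person < ApplicationRecord
  # Hypothetical extraction: the query logic moves out of the controller and into the model.
  def self.search(name)
    where("name ILIKE ?", "%#{name}%").first
  end
end

class SearchesController < ApplicationController
  def index
    @person = Person.search(params[:name])
  end
end

The integration test only cares about the HTTP request and the HTML that comes back, so it keeps passing through the entire refactor.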

There are trade-offs, of course! We lose some test speed since we’re sending real HTTP requests and the potential errors might not be as localized. Most (not all!) of the time though, trading those things for the high confidence and flexibility that integration tests give us is worth it.

Asking the right question

“What kinds of tests should I be writing?” will get you different answers depending on who you ask. It leaves too much up to personal preference and doesn’t immediately force you to consider some more serious trade-offs.

Let’s re-frame the question to something like: “How can we enable the widest possible set of people to deliver software in a reasonable amount of time?”

By putting emphasis on writing isolated unit tests, we raise the barrier for junior developers to write “good” code. Isolated unit tests require a more sophisticated understanding of software design that isn’t easy to teach or learn in a short period of time.

Integration tests allow you to write code that you can be confident about shipping, even without a sophisticated understanding of how to structure software systems. I’m not saying that we shouldn’t think more deeply about how our software is designed. Using integration tests as the primary way to verify that the system works requires you to accept that your current understanding of the system is fragile, and will inevitably change. And that’s okay! Testing at a higher level of abstraction allows you to change the underlying design of the code when you and your team become more familiar with the domain in which you’re working.

When should you write unit tests?

The wrong message to take away from this would be that it never makes sense to write unit tests. It’s important to know when the scales tip in favor of isolating code within more rigid boundaries.

Tests should become more granular and isolated in order to improve experimental control[2] of your code. Here’s an example to show you what I mean:

Let’s pretend that we’re building an application that allows customers to upload hundreds of pieces of data to feed into our proprietary Algorithm. Here’s what a (contrived, non-idiomatic) Rails controller that runs the algorithm might look like:

class AlgorithmController < ApplicationController
  def run
    @inputs = Input.where(user: current_user).pluck(:value) # the raw data to feed into the algorithm
    @insight = Algorithm.run(@inputs)

    if @insight < 5
      redirect_to some_path
    else
      head :internal_server_error
    end
  end
end

We started with writing an integration test to verify the behavior of this controller, but we ran into an issue: it’s difficult to simulate the scenarios in which we want the controller to behave in certain ways. There are many different permutations of input data that will create different results in our algorithm. To solve this problem, we can create a boundary around the algorithm, unit test it, and mock out the algorithm in the controller test.
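
Before looking at the tests, here’s a rough sketch of what that boundary might be: a plain Ruby class with a single entry point and no knowledge of controllers, requests, or the database. The scoring logic below is entirely hypothetical; the only point is that Algorithm takes raw rows of input data and returns a single number.

class Algorithm
  def self.run(inputs)
    # Hypothetical scoring: average the signal across each row, scaled to roughly 0-10.
    row_scores = inputs.map { |row| row.sum.to_f / row.length }
    row_scores.sum / row_scores.length * 10
  end
end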

Here’s what the unit test might look like:

class AlgorithmTest < ActiveSupport::TestCase
  test "one permutation" do
    result = Algorithm.run([[1,0,0,1,1,1,0], [0,1,1,0,0,1,0]])
    # assert something about result
  end

  test "another permutation" do
    result = Algorithm.run([[0,0,0,1,0,0,0], [1,1,0,0,0,1,1]])
    # assert something about result
  end

  test "yet another" do
    result = Algorithm.run([[0,0,0,1,1,0,0], [0,0,0,0,0,0,0]])
    # assert something about result
  end
end

And the integration test:

class AlgorithmControllerTest < ActionDispatch::IntegrationTest
  test "when the algorithm returns a number below 5" do
    Algorithm.stub :run, 4.32 do
      post algorithm_path
      assert_response :redirect
    end
  end

  test "when the algorithm returns a number above 5" do
    Algorithm.stub :run, 6.19 do
      post algorithm_path
      assert_response :error
    end
  end
end

By creating this boundary, we get more granular control over how we exercise the code, which makes its behavior much easier to verify.

Isolation is not an inherently valuable trait of your code[3] - it’s a tool for getting fine-tuned control when you need it. Needing that control is the exception, not the rule.

  1. Michael Feathers has many valuable thoughts on how code changes over time in Working Effectively with Legacy Code 

  2. Justin Searls’ talk, Please Don’t Mock Me introduced me to the concept of “experimental control” of code. 

  3. Adam Wathan’s talk, Lies You’ve Been Told About Testing heavily influenced how I think about integration vs. unit tests.