TDD: Reintroducing the Testing API

Michael T. Andemeskel

Published in Level Up Coding

Dec 29, 2023

Summary

Don’t access the code you are testing directly from your tests; use an abstraction layer, i.e., a testing API (TAPI).
TAPIs are critical to managing testing overhead.
TAPIs decouple tests from implementation.
TAPIs allow easier and quicker test refactoring.
TAPIs allow you to write tests faster for a relatively small upfront cost.
TAPIs prevent the fragile-test problem, i.e., one change breaks multiple tests requiring modifications in multiple places.
TAPIs are extendable, allowing you to build more complex tests without increasing the difficulty of reading and maintaining tests.
You can easily create TAPIs by organizing all your test helpers and matchers into distinct but cohesive classes/objects.

If you want to know what led me to TAPIs, scroll to the bottom.

What is a TAPI?

To expand on Uncle Bob’s laconic definition (Clean Architecture, Chp 28), a testing API (TAPI) is an abstraction between your tests and the behavior being tested. TAPIs allow you to construct tests that depend not on the code being tested but on the TAPI. The tests use the TAPI to call and manipulate the code. Critically, TAPIs encapsulate all the logic needed by the test in one class, which lowers the cost of additional tests and updates. That’s the magic behind TAPIs; they let you write more tests at a lower price per test by removing duplication and creating a helpful abstraction. A good rule of thumb is that your test and code size should be roughly equivalent, but most projects are majority tests (remember the old adage, a programmer spends 80–90% of their time writing tests?). This is due to rampant duplication and coupling, poor or non-existent abstraction, and disorganization, all of which are caused by exposing tests directly to the code being tested.

The best place to learn and get used to TAPIs is in the UI. So, I will focus on UI TAPIs, but you can use TAPIs everywhere: in API tests, DB tests, business logic tests, and most importantly, E2E tests.

The Test Case

Let’s say you have a registration page — the user enters a few bits of information and clicks submit to register an account. The first version only requires the user to enter a username and password. But eventually, you want to support federated logins, i.e., register and log in with Google, Microsoft, and social media accounts. The last step is to allow users who have paid for a product key to skip the registration process. They can use the product key to register an account, and the page will set up their account using the information linked to the product key.

This page’s first draft of tests will check the most straightforward use case to test, registering with a username and password. It may look something like this:

describe('registration page', () => {
 it('lets the user register', async () => {
   const page = render()

   page.getByCss('#username').fill('tester')
   page.getByCss('#password').fill('bad_password')
   page.getByCss('#submit').click()

   await page.getByCss('#progress-bar').waitForElementToDissapear()

   expect(page.getTitle()).toEqual('Registered! Welcome tester.')
 })
})

This test renders the page, fills the form inputs, clicks the submit button, waits for the progress bar to disappear, and checks that the user is notified that registration succeeded. Nothing is wrong with a test like this; you can get very far with it. Of course, there are improvements to be made: if we change the success message (the implementation), we break this test even though the registration process (the behavior) has not changed. Likewise, the IDs of the inputs and submit button may change (implementation), breaking the test while the code still works. But for now, this is a perfect starting place.

Refactoring into Helper Functions

An experienced tester would first refactor the test and encapsulate the form-filling and submission logic in helper functions. Why? Because we are not done testing, we still need to test the edge cases, e.g., what happens when the user omits the username and password? What if they enter a duplicate username? All these tests require us to fill out the form and click submit; the experienced developer can anticipate and avoid all the duplication these tests will create.

describe('registration page', () => {
 it('lets the user register', async () => {
   const page = render()

   fillForm(page, { username: 'tester', password: 'bad_password' })
   await submitForm(page)

   expect(page.getTitle()).toEqual('Registered! Welcome tester.')
 })
})


function fillForm(page, { username, password }) {
 page.getByCss('#username').fill(username)
 page.getByCss('#password').fill(password)
}

async function submitForm(page) {
 page.getByCss('#submit').click()
 await page.getByCss('#progress-bar').waitForElementToDissapear()
}

We create two functions, fillForm and submitForm, to hold the username-and-password filling and the submit-button click, respectively. There are two separate functions because:

Filling the form and submitting the form are two different activities.
We want to test form validation, i.e., ensuring the username is unique, the password is strong, etc., independently of the registration logic (what happens after the submission).
This is because the form validation logic is separate from whatever logic happens after the click.
It is better to prevent an error (by preventing bad input) vs showing an error (letting the user enter bad input and returning an error response).
We want to test what happens when the form is NOT filled, and the user clicks the submit button.

Creating helper functions like this is a common pattern in testing. The key is to take it further and coalesce all these helper functions into one interface (the I in TAPI). These helper functions will become that powerful abstraction layer that will let us write every subsequent test faster, with more confidence, and with less dependence on the details of the UI.

Using the Helper Functions

Now, let’s add the missing tests using the helper functions.

// ...
 it('disables the submit button when the username is empty', async () => {
   const page = render()

   fillForm(page, { username: '', password: 'bad_password' })

   expectSubmitButtonToBeDisabled(page)
 })


 it('disables the submit button when the password is empty', async () => {
   const page = render()

   fillForm(page, { username: 'tester', password: '' })

   expectSubmitButtonToBeDisabled(page)
 })


 it('prevents duplicate usernames', async () => {
   const page = render()

   fillForm(page, { username: 'tester', password: 'bad_password' })
   await submitForm(page)

   const page2 = render()
   fillForm(page2, { username: 'tester', password: 'bad_password' })

   expect(
     page2.getByCss('.alert').textContent()
   ).toContain('Username has been taken.')
   expectSubmitButtonToBeDisabled(page2)
 })

// ...

function expectSubmitButtonToBeDisabled(page) {
 expect(
   page.getByCss('#submit').disabled()
 ).toBeTruthy()
}

I’ve added three more tests demonstrating the usefulness of the helper functions. I’ve also created a new helper function, expectSubmitButtonToBeDisabled, that removes duplicate asserts — yes, helper functions can contain not only code to interact with the UI but also to test it. Testing is often a form of interaction, so it should be hidden behind the TAPI. At this point, I’m ready to create a TAPI.

Why now? I’m done with the simple tests, and I know the future tests will be more complex and cover different use cases. I know from experience that each use case’s tests should be organized separately in different files or even folders — this helps future developers know exactly where to go to update tests and which tests to run.

Avoid cramming all your tests into one file; it is very tempting because if you have one registration page, your brain automatically thinks you must have one test file. That’s a mistake. You have one implementation (the registration page) that hosts multiple behaviors (register by username/password, register by social media, and register by product key). Each behavior deserves the respect of a separate file.

Why not just use helpers?

The point of a TAPI is to make testing easier and faster and give you more confidence in writing future tests. That’s what abstractions do best; the trade-off is increased complexity and obfuscation. For a little more code and indirection, abstractions provide a convenient way to do something. But why a class? Why not just a collection of helpers in a separate file?

That’s what most people do. There is nothing particularly wrong with putting every helper and custom matcher into a file like a child haphazardly shoving toys and knickknacks into their footlocker. Jokes aside, this disorganization has a cost; abstractions work when they are organized, focused, and single-purposed. A helper file that exports 20 functions is not a helpful abstraction.

It is a hairy ball of dirty coupling that is hard to maintain and use. With a class, you get the benefits of OOP, i.e., you can later take the RegistrationTapi and subclass it into FederatedRegistrationTapi and ProductKeyRegistrationTapi. Classes allow the developer to precisely construct abstraction by dissecting responsibilities into decoupled but cohesive parts. Collections of functions exported from random helper files prevent you from cleaning and isolating your tests like a class can. Of course, OOP and inheritance are not required; you can create a TAPI with plain old objects and dependency injection — that’s how I started.
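To make the "plain old objects and dependency injection" option concrete, here is a minimal sketch. It assumes a hypothetical in-memory FakeRegistrationPage in place of a real UI testing library, and the '#product-key' selector is invented for illustration. The factory receives render as its only dependency, and a second factory extends the first by composition instead of subclassing.

```typescript
// In-memory stand-in for a rendered page (an assumption of this sketch,
// not a real testing library).
class FakeRegistrationPage {
  private values = new Map<string, string>()

  fill(selector: string, value: string): void {
    this.values.set(selector, value)
  }

  read(selector: string): string {
    return this.values.get(selector) ?? ''
  }

  // Toy rule: the submit button stays disabled until both fields are filled.
  isDisabled(selector: string): boolean {
    if (selector !== '#submit') return false
    return !this.values.get('#username') || !this.values.get('#password')
  }
}

type Render = () => FakeRegistrationPage
type FormValues = { username: string; password: string }

// The TAPI as a plain object: render is injected, helpers close over the page.
function createRegistrationTapi(render: Render) {
  const page = render()
  return {
    page,
    fillForm({ username, password }: FormValues): void {
      page.fill('#username', username)
      page.fill('#password', password)
    },
    isSubmitDisabled(): boolean {
      return page.isDisabled('#submit')
    },
  }
}

// Extension by composition: reuse the base helpers and add one more.
function createProductKeyRegistrationTapi(render: Render) {
  const base = createRegistrationTapi(render)
  return {
    ...base,
    fillProductKey(key: string): void {
      base.page.fill('#product-key', key) // hypothetical selector
    },
  }
}

const tapi = createRegistrationTapi(() => new FakeRegistrationPage())
tapi.fillForm({ username: 'tester', password: '' })
console.log(tapi.isSubmitDisabled()) // true: the password is empty
```

The tests only ever see the object returned by the factory, so swapping the fake page for a real one changes the render argument and nothing else.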

TAPI First Draft

The first draft of our TAPI will be a simple class that takes the render function as an argument in the constructor and exposes the helper functions we’ve already built.

export class RegistrationTapi {
  public page

  constructor(render) {
    this.page = render()
  }

  fillForm({ username, password }) {
    this.page.getByCss('#username').fill(username)
    this.page.getByCss('#password').fill(password)
  }

  async submitForm() {
    this.submitBtn().click()
    await this.page.getByCss('#progress-bar').waitForElementToDissapear()
  }

  submitBtn() {
    return this.page.getByCss('#submit')
  }

  expectSubmitButtonToBeDisabled() {
    expect(this.submitBtn().disabled()).toBeTruthy()
  }
}

Pretty simple. Putting the render function inside the registration TAPI eliminates the last bit of duplication in our tests. I made the page object a public property for one-offs where you need to manipulate the page directly. Each helper function is a public method. I’ve also removed the duplicate calls to get the submit button with a submitBtn method — you can memoize this, but depending on your testing library, this may cache and return old versions of the button and cause your tests to be flaky.

import { RegistrationTapi } from "./registration_tapi"


describe('registration page', () => {
  it('lets the user register', async () => {
    const tapi = new RegistrationTapi(render)

    tapi.fillForm({ username: 'tester', password: 'bad_password' })
    await tapi.submitForm()

    expect(tapi.page.getTitle()).toEqual('Registered! Welcome tester.')
  })


  it('disables the submit button when the username is empty', async () => {
    const tapi = new RegistrationTapi(render)

    tapi.fillForm({ username: '', password: 'bad_password' })

    tapi.expectSubmitButtonToBeDisabled()
  })


  it('disables the submit button when the password is empty', async () => {
    const tapi = new RegistrationTapi(render)

    tapi.fillForm({ username: 'tester', password: '' })

    tapi.expectSubmitButtonToBeDisabled()
  })


  it('prevents duplicate usernames', async () => {
    const tapi = new RegistrationTapi(render)

    tapi.fillForm({ username: 'tester', password: 'bad_password' })
    await tapi.submitForm()

    const tapi2 = new RegistrationTapi(render)
    tapi2.fillForm({ username: 'tester', password: 'bad_password' })

    expect(
      tapi2.page.getByCss('.alert').textContent()
    ).toContain('Username has been taken.')
    tapi2.expectSubmitButtonToBeDisabled()
  })
})
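The memoization caveat mentioned for submitBtn can be made concrete with a small sketch. It assumes a hypothetical in-memory RerenderingPage whose re-render replaces the button object, which is roughly how some UI frameworks swap DOM nodes; none of these names come from a real testing library.

```typescript
interface FakeButton {
  disabled: boolean
}

// Hypothetical page: every re-render creates a brand-new button object.
class RerenderingPage {
  private submit: FakeButton = { disabled: true }

  getByCss(selector: string): FakeButton {
    if (selector !== '#submit') throw new Error(`unknown selector: ${selector}`)
    return this.submit
  }

  rerender(disabled: boolean): void {
    this.submit = { disabled }
  }
}

// Memoized lookup: caches the first button it sees and never refreshes it.
class MemoizedTapi {
  private page: RerenderingPage
  private cachedBtn: FakeButton | undefined = undefined

  constructor(page: RerenderingPage) {
    this.page = page
  }

  submitBtn(): FakeButton {
    if (this.cachedBtn === undefined) {
      this.cachedBtn = this.page.getByCss('#submit')
    }
    return this.cachedBtn
  }
}

// Fresh lookup: asks the page on every call, as in the first draft above.
class FreshTapi {
  private page: RerenderingPage

  constructor(page: RerenderingPage) {
    this.page = page
  }

  submitBtn(): FakeButton {
    return this.page.getByCss('#submit')
  }
}

const page = new RerenderingPage()
const memoized = new MemoizedTapi(page)
const fresh = new FreshTapi(page)

memoized.submitBtn() // primes the cache with the initial, disabled button
page.rerender(false) // the form became valid; the button object is replaced

console.log(memoized.submitBtn().disabled) // true: stale object from the cache
console.log(fresh.submitBtn().disabled) // false: current state of the page
```

Whether a real library behaves this way depends on whether its element handles are live or snapshots, so treat the sketch as a reason to check before caching, not as a rule.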

Improve Isolation from Implementation

The first draft of the TAPI is not done yet. We directly access the page object in a few places — where we get the page title and the alert text. Those should be abstracted as well.

    expect(tapi.page.getTitle()).toEqual('Registered! Welcome tester.')
// ...
    expect(
      tapi2.page.getByCss('.alert').textContent()
    ).toContain('Username has been taken.')

In these two cases, I am testing what happens after registering and after entering a duplicate username. The issue is that the tests depend directly on the notifications being in the page title or in an HTML element with an `alert` class. These are implementation details that don’t matter; we can show these messages in numerous other ways. UI is the least stable part of the code; while the behavior might not change, the look and feel of the UI will change, often radically and unpredictably. Tests that depend directly on how the UI is implemented will be fragile and hard to update. So, I will do two things:

Encapsulate and abstract by moving these tests into their own matchers in the TAPI — expectLoginMessageToBeShown and expectDuplicateUsernameWarningToBeShown
Isolate and generalize by checking the entire page for the `Username has been taken.` warning instead of a specific element. This isolates our test from the CSS implementation. Often, the best way to make your tests less fragile is to make them more generic.

Now we have two new matchers in the TAPI.

  expectLoginMessageToBeShown() {
    expect(this.page.getTitle())
      .toEqual('Registered! Welcome tester.')
  }


  expectDuplicateUsernameWarningToBeShown() {
    expect(
      this.page.textContent()
    ).toContain('Username has been taken.')
  }

And we update our tests.

  it('lets the user register', async () => {
    const tapi = new RegistrationTapi(render)

    tapi.fillForm({ username: 'tester', password: 'bad_password' })
    await tapi.submitForm()

    tapi.expectLoginMessageToBeShown()
  })
// ...
  it('prevents duplicate usernames', async () => {
    const tapi = new RegistrationTapi(render)

    tapi.fillForm({ username: 'tester', password: 'bad_password' })
    await tapi.submitForm()

    const tapi2 = new RegistrationTapi(render)
    tapi2.fillForm({ username: 'tester', password: 'bad_password' })

    tapi2.expectDuplicateUsernameWarningToBeShown()
    tapi2.expectSubmitButtonToBeDisabled()
  })

We can take this a step further and remove all CSS selectors from the TAPI; that is the ideal, and if your testing framework supports accessing HTML through text, roles, and labels, i.e., what the user sees, then do it. Playwright does this, but it is a bit clunky, and not all the edge cases are handled. Now you may be wondering, isn’t testing against the text in the page a form of coupling to the implementation? Yes and no: it is coupling, but it is coupling to the behavior portion of the implementation.
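As a rough illustration of locating inputs by what the user sees, the sketch below models label-and-role lookup on a tiny in-memory structure. FakeDom, FakeNode, and byLabel are assumptions made for this example; in Playwright the comparable locators are getByRole, getByLabel, and getByText, which this sketch does not call.

```typescript
interface FakeNode {
  id: string // implementation detail: free to change
  role: 'textbox' | 'button'
  label: string // what the user actually sees
  value: string
}

// Hypothetical in-memory page offering both lookup styles.
class FakeDom {
  private nodes: FakeNode[]

  constructor(nodes: FakeNode[]) {
    this.nodes = nodes
  }

  byId(id: string): FakeNode | undefined {
    return this.nodes.find((n) => n.id === id)
  }

  byLabel(role: FakeNode['role'], label: string): FakeNode | undefined {
    return this.nodes.find((n) => n.role === role && n.label === label)
  }
}

// A TAPI that never mentions an ID or CSS selector.
class LabelBasedRegistrationTapi {
  private dom: FakeDom

  constructor(dom: FakeDom) {
    this.dom = dom
  }

  fillForm({ username, password }: { username: string; password: string }): void {
    this.field('Username').value = username
    this.field('Password').value = password
  }

  private field(label: string): FakeNode {
    const node = this.dom.byLabel('textbox', label)
    if (!node) throw new Error(`no textbox labelled "${label}"`)
    return node
  }
}

// The IDs were renamed in a redesign; the labels the user sees did not change.
const dom = new FakeDom([
  { id: 'user-name-v2', role: 'textbox', label: 'Username', value: '' },
  { id: 'pass-v2', role: 'textbox', label: 'Password', value: '' },
])

new LabelBasedRegistrationTapi(dom).fillForm({ username: 'tester', password: 'bad_password' })

console.log(dom.byLabel('textbox', 'Username')?.value) // tester
console.log(dom.byId('username')) // undefined: the old ID no longer exists
```

Because the lookups live in one private method, moving from CSS selectors to labels is a change inside the TAPI only; the tests that call fillForm stay as they are.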

There’s the implementation that is behind the scenes, and there is a thin layer of implementation that is exposed to the user, i.e., the behavior. When you test against CSS, you are testing below this layer; when you test against the UI that the user interacts with, you are testing the behavior. Rule of thumb: if you are testing against something the user will see and interact with, like a warning that says `Username has been taken.`, then you are coupling to behavior, which is the point of testing. If you are testing against something only you see, i.e., the HTML tags, CSS, or JS that shows and styles that warning, then you are testing implementation.
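The rule of thumb can be seen in a toy example. The two renderings below are hypothetical: the same warning is shown first in a div with an `alert` class and then, after an imagined redesign, in a paragraph with a `toast` class. The check that names the CSS class breaks; the check on visible text does not.

```typescript
interface FakeMarkup {
  tag: string
  cssClass: string
  text: string
}

// Same behavior (the user sees the warning), two different implementations.
const beforeRedesign: FakeMarkup[] = [
  { tag: 'div', cssClass: 'alert', text: 'Username has been taken.' },
]
const afterRedesign: FakeMarkup[] = [
  { tag: 'p', cssClass: 'toast', text: 'Username has been taken.' },
]

// Coupled to implementation: depends on the element carrying the alert class.
function warningShownByCss(markup: FakeMarkup[]): boolean {
  return markup.some(
    (el) => el.cssClass === 'alert' && el.text.includes('Username has been taken.')
  )
}

// Coupled to behavior: depends only on the text the user can read.
function warningShownByText(markup: FakeMarkup[]): boolean {
  return markup
    .map((el) => el.text)
    .join(' ')
    .includes('Username has been taken.')
}

console.log(warningShownByCss(beforeRedesign), warningShownByText(beforeRedesign)) // true true
console.log(warningShownByCss(afterRedesign), warningShownByText(afterRedesign)) // false true
```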

Next: Extending our TAPI to Test More Use Cases

In the next blog, you will see the true power and utility of a testing API by extending RegistrationTapi to test multiple use cases. Then, we will finish by making the ultimate abstraction, isolating our tests from the testing framework. This may seem extraneous, but the point of testing is not to adhere to a testing framework; it is to write better code faster and with more confidence. If you are in a rapidly changing environment like I am, JS/TS, then you know testing frameworks will come and go; the last thing you want is to be stuck with a testing framework that is not maintained or, as in the example of Jest, does not support the latest version of your language (Jest doesn’t support ESM without a lot of pain, bugs, and missing features, although they are making good progress on that complex migration).

Tests are ultimately where most of the value of your work resides; the implementation will and should change rapidly. The behavior should change only when your business needs change. Therefore, your tests need to outlive your code and be as long-lived as your business needs. To accomplish this longevity, you need to decouple from not only implementation but testing frameworks as well. TAPIs can help you do that.
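One possible shape for that last layer of isolation, sketched under assumptions (the follow-up post may well do it differently): the TAPI receives a tiny assertion interface instead of calling the framework's expect directly. Asserter, SubmitPage, and FrameworkAgnosticTapi are hypothetical names invented for this example.

```typescript
// The only assertion surface the TAPI knows about (hypothetical interface).
interface Asserter {
  isTrue(condition: boolean, message: string): void
}

// One adapter: plain exceptions, no testing framework involved.
const throwingAsserter: Asserter = {
  isTrue(condition: boolean, message: string): void {
    if (!condition) throw new Error(message)
  },
}

// Minimal stand-in for the rendered page.
interface SubmitPage {
  submitDisabled: boolean
}

class FrameworkAgnosticTapi {
  private page: SubmitPage
  private asserter: Asserter

  constructor(page: SubmitPage, asserter: Asserter) {
    this.page = page
    this.asserter = asserter
  }

  expectSubmitButtonToBeDisabled(): void {
    this.asserter.isTrue(
      this.page.submitDisabled,
      'expected the submit button to be disabled'
    )
  }
}

const agnosticTapi = new FrameworkAgnosticTapi({ submitDisabled: true }, throwingAsserter)
agnosticTapi.expectSubmitButtonToBeDisabled() // passes silently
```

An adapter for Jest would implement isTrue by forwarding to expect(condition).toBe(true). Replacing the framework then means writing one new adapter, while the TAPI and the tests that use it stay untouched.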

Background: But why?

At Stitch Fix, I spent hundreds of hours maintaining and updating an app responsible for purchase orders. It was not technically complex. The complexity came from the years of business logic embedded in the system, the complexity of the purchasing processes (which took months, if not a year or two), and the multiple teams working on the app. In turn, the tests were equally complex: hundreds of lines of setup, teardown, and tests, all in single files. The purchase order model’s test file had several thousand lines of code and dozens of tests!

Testing was absolutely miserable, but I and the senior members of my team shrugged it off. Like everyone else in the industry, we accepted the old adage that a developer spends 90% of their time testing and 10% writing code. Our codebases reflected this; there were more lines of tests than code. This is TDD, after all. A small sacrifice.

Now that I’m working alone, I can’t bear spending so much time writing code that doesn’t add features, but unlike other startups, we can’t afford to skimp on testing. At first, my solution was to skip unit and integration tests and focus on end-to-end (E2E) tests, which did lower the number of tests. But E2E tests are sledgehammers.

They are hard to set up and tear down because they touch every part of the system. Databases need to be created, filled, and destroyed. Services need to be started or mocked. Frontends need to be built.

They lack precision; if an edge case exists in a database transformer, you can’t test it easily with an E2E test from the frontend. Imagine setting up a server, database, and frontend just to ensure your models return empty relationships as empty arrays. E2E tests by themselves were not the answer. I needed to make my tests easier to set up, run, and tear down. How?

The answer was to use testing APIs (TAPIs), and I found it in Uncle Bob’s Clean Architecture.

[A] specific API that the tests can use to verify all the business rules. This API should have superpowers that allow the tests to avoid security constraints, bypass expensive resources (such as databases), and force the system into particular testable states. This API will be a superset of the suite of interactors and interface adapters that are used by the user interface.
The purpose of the testing API is to decouple the tests from the application.

Clean Architecture, Chp 28

Unfortunately, Uncle Bob does not expand on this paragraph. Nonetheless, after lots of trial and error and reading Kent Beck’s masterpiece Test-Driven Development by Example to relearn TDD, I have come up with something that works well. Something I can explain to future teammates. Something I can distill into simple rules anyone can follow, i.e., a habit like TDD.

And yes, I tried using ChatGPT to write tests. It worked for unit tests, as long as I reviewed the tests and only asked for simple ones; there were often hidden logic bugs and typos. But I don’t want unit tests! I want E2E tests, and sadly, I couldn’t figure out how to get it to write those. I had more success asking it to write code for existing tests, but passing tests is the best part of TDD; why rob myself of that joy?

Sources/Inspiration

Clean Architecture: A Craftsman’s Guide to Software Structure and Design, Robert Cecil Martin “Uncle Bob” — Chapter 28
Test-Driven Development by Example, Kent Beck — the whole thing!