Code Quality, Not Test Coverage

It’s easy to look at a metric like test coverage and equate high test coverage with high code quality. But it’s just not that simple. Test coverage and other “code quality” metrics are just heuristics that can be indicators of code quality. That does not mean they equate to high code quality.

Image by PCH-Vector on iStock by Getty Images

Scenario: The Code Coverage Mandate

BigTime Software Co (BTS) has a new code quality initiative. The company’s IT executive management team recently read Unit Testing for Dummies, and test coverage is now the metric du jour. So much so that the code quality initiative really boils down to code quality = (test coverage >= 75%). In other words, if a project/module/widget has test coverage greater than or equal to 75%, then it is considered to have high quality code and is compliant with the initiative. Additionally, if any given area is over 75% compliant with the code quality initiative, then the executive in charge of that area gets a bigger bonus, and he or she gets to talk about how their area has super high quality code at the next IT all-hands meeting.

The problem, of course, is that not only is test coverage for existing software virtually non-existent at BTS, but BTS developers really have no idea how to write high-value tests. The other problem is that unit testing (even when done well and with high coverage) does not necessarily equate to high quality software. But that is way outside the scope of Unit Testing for Dummies, so, for the moment, and for the remainder of this post, we’ll stick to just unit testing.

So, what are developers to do when they are unfamiliar with unit testing but still want to be compliant with the code quality initiative?

Here are some things that have actually happened in this type of scenario.

Testing For Coverage And Only Coverage

It’s one thing to claim test coverage by writing high value tests that evaluate system logic. It’s quite another thing to write tests against do-nothing code to boost test coverage. And yes, some people do advocate for that :’(

The problem with writing a lot of tests against code that does virtually nothing is that it skews code coverage higher at the expense of valuable tests - the tests that actually verify your code does what it’s supposed to do. To illustrate this, let’s take the example of a Java Bean class.

public class User {
  private String id;
  private String firstName;
  private String lastName;

  // Plain accessors with no logic of their own, typically generated by an IDE.
  public String getId() {
    return this.id;
  }

  public void setId(String id) {
    this.id = id;
  }

  public String getFirstName() {
    return this.firstName;
  }

  public void setFirstName(String firstName) {
    this.firstName = firstName;
  }

  public String getLastName() {
    return this.lastName;
  }

  public void setLastName(String lastName) {
    this.lastName = lastName;
  }
}

While it’s true that there are methods in the above User class that could be tested, it’s also true that those methods don’t do much. It’s also very likely that those methods were generated by an IDE based on conventions. Testing those methods would boost overall test coverage, but one could argue that it would also be a waste of time. A better (and more honest) approach might be to exclude those kinds of classes from test coverage, altogether.

Note: Testing code like the Java Bean example shown above may be done tangentially through other tests, which may also help boost test coverage. That’s not a bad thing. However, dedicated low-value tests should be omitted.
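To make the idea of tangential coverage concrete, here is a minimal sketch. The NameFormatter class and its formatting rules are hypothetical and exist only for illustration, and the checks use plain Java rather than a test framework so the example stays self-contained. The point is that a test of real logic exercises the bean’s accessors along the way, with no dedicated getter/setter tests.

```java
// Trimmed copy of the bean, so the example compiles on its own.
class User {
  private String firstName;
  private String lastName;

  public String getFirstName() { return this.firstName; }
  public void setFirstName(String firstName) { this.firstName = firstName; }
  public String getLastName() { return this.lastName; }
  public void setLastName(String lastName) { this.lastName = lastName; }
}

// Hypothetical class with actual logic worth testing.
class NameFormatter {
  // Formats a user as "Last, First"; falls back to whichever part is present.
  static String displayName(User user) {
    String first = user.getFirstName();
    String last = user.getLastName();
    if (first == null || first.isEmpty()) {
      return last == null ? "" : last;
    }
    if (last == null || last.isEmpty()) {
      return first;
    }
    return last + ", " + first;
  }
}

class NameFormatterCheck {
  static void check(boolean condition, String message) {
    if (!condition) {
      throw new AssertionError(message);
    }
  }

  public static void main(String[] args) {
    User user = new User();
    user.setFirstName("Ada");
    user.setLastName("Lovelace");
    // The accessors are covered as a side effect of testing the formatter.
    check(NameFormatter.displayName(user).equals("Lovelace, Ada"), "full name");

    User firstOnly = new User();
    firstOnly.setFirstName("Ada");
    check(NameFormatter.displayName(firstOnly).equals("Ada"), "missing last name");
  }
}
```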

1. Matching Test Results To The Code

This is also known as backing tests into the code. Instead of writing tests that evaluate the expected functionality, you can write tests based on what the code is doing. That way, all your tests are certain to pass, and your test coverage goes up. It’s a win-win, right?

Seriously, never do this.
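Here is a small sketch of why this is harmful. The discount rule and the Discounts class are assumptions made up for this example. Both checks execute exactly the same lines, so they contribute the same coverage, but only one of them can catch the bug.

```java
class Discounts {
  // Assumed requirement: orders of 10000 cents ($100) or more get 10% off.
  // The implementation has an off-by-one bug: it uses > instead of >=.
  static long discountCents(long orderTotalCents) {
    if (orderTotalCents > 10_000) {
      return orderTotalCents / 10;
    }
    return 0;
  }
}

class BackedInTestDemo {
  // Backed into the code: the expected value was copied from what the code returns.
  static boolean backedInTestPasses() {
    return Discounts.discountCents(10_000) == 0;
  }

  // Written from the requirement: a $100 order should get a $10 discount.
  static boolean requirementTestPasses() {
    return Discounts.discountCents(10_000) == 1_000;
  }

  public static void main(String[] args) {
    System.out.println("backed-in test passes: " + backedInTestPasses());
    System.out.println("requirement test passes: " + requirementTestPasses());
  }
}
```

The backed-in check passes and quietly enshrines the bug as expected behavior. The requirement-based check fails, which is exactly what a useful test should do here.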

2. Asserting True

This is similar to matching test results to the code, but even lazier. Basically, you write tests to achieve the desired coverage, but you can’t be bothered (or you don’t know how) to effectively evaluate the results of those tests. So, you just end the tests with a true assertion, ensuring that coverage is elevated and that all tests will pass.
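As a rough illustration, consider the sketch below. The Temperature class is hypothetical and deliberately wrong. The first check runs the code and then asserts nothing about the result, so it passes no matter what the method returns. The second compares the result to a known value.

```java
class Temperature {
  // Intentionally wrong for illustration: the correct formula is c * 9 / 5 + 32.
  static double celsiusToFahrenheit(double celsius) {
    return celsius * 5 / 9 + 32;
  }
}

class AssertTrueDemo {
  // Runs the code for coverage, then "asserts" something that is always true.
  static boolean assertTrueStyleTestPasses() {
    Temperature.celsiusToFahrenheit(100);
    return true;
  }

  // Checks the actual result against a known value: 100 C is 212 F.
  static boolean meaningfulTestPasses() {
    double result = Temperature.celsiusToFahrenheit(100);
    return Math.abs(result - 212.0) < 1e-9;
  }

  public static void main(String[] args) {
    System.out.println("assert-true test passes: " + assertTrueStyleTestPasses());
    System.out.println("meaningful test passes: " + meaningfulTestPasses());
  }
}
```

Both checks produce identical coverage for celsiusToFahrenheit, yet only the second one notices that the formula is broken.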

3. Changing Coverage Reporting

This one is kind of interesting. But it wouldn’t be on here if it didn’t actually happen in the wild. Instead of reporting on total coverage, you can report on the test coverage of a small subset of the code, like just the code changed in the last commit. The logic is undeniable: supply an arbitrary metric for an arbitrary benchmark. It definitely gets points for creativity.
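The arithmetic behind this trick is simple. The line counts below are invented for illustration and do not come from any real project; CoverageReport is a hypothetical helper.

```java
class CoverageReport {
  // Percentage of lines covered; an empty scope is reported as 0.
  static double percent(long coveredLines, long totalLines) {
    if (totalLines == 0) {
      return 0.0;
    }
    return 100.0 * coveredLines / totalLines;
  }

  public static void main(String[] args) {
    // Hypothetical whole codebase: 20,000 lines, 1,000 of them covered.
    System.out.println("total coverage: " + percent(1_000, 20_000) + "%");
    // Hypothetical last commit: 40 changed lines, 36 of them covered.
    System.out.println("last-commit coverage: " + percent(36, 40) + "%");
  }
}
```

With these made-up numbers, the same codebase reports 5% or 90% depending on which scope is chosen, and only one of those figures clears a 75% benchmark.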

What Are Unit Tests?

The Test Pyramid

Unit tests are the most granular form of tests. They test software at an individual software unit level. In the case of object-oriented programming, often a unit can be equated with a class. Unit tests make up the largest body of tests on the test pyramid - there are a lot of them and, collectively, they can be run fast (think 100+ tests in under 5 minutes).

The most important attribute of a unit test is that it can be run fast. The second most important attribute is that a unit test can be run in isolation. There are various opinions on what “isolation” means. One extreme argues that isolation means testing no more than one “unit” (e.g. class) of software at a time. However, we prefer a broader definition where isolation means not having to depend on any external systems. By that definition, things like testing with a packaged in-memory database and testing calls to web services could still be considered unit tests.
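Here is one way the broader definition can look in practice. This sketch swaps in a hand-written in-memory store rather than a packaged in-memory database, and every name in it (UserStore, InMemoryUserStore, WelcomeService) is hypothetical. The test spans more than one class, but it never touches an external system, so it stays fast and repeatable.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical boundary to whatever system actually stores users.
interface UserStore {
  Optional<String> findEmail(String userId);
}

// In-memory stand-in used by tests; no network or database required.
class InMemoryUserStore implements UserStore {
  private final Map<String, String> emails = new HashMap<>();

  void put(String userId, String email) {
    emails.put(userId, email);
  }

  @Override
  public Optional<String> findEmail(String userId) {
    return Optional.ofNullable(emails.get(userId));
  }
}

// The logic under test depends only on the interface.
class WelcomeService {
  private final UserStore store;

  WelcomeService(UserStore store) {
    this.store = store;
  }

  String greetingFor(String userId) {
    return store.findEmail(userId)
        .map(email -> "Welcome, " + email)
        .orElse("Unknown user");
  }
}

class WelcomeServiceCheck {
  public static void main(String[] args) {
    InMemoryUserStore store = new InMemoryUserStore();
    store.put("u1", "ada@example.com");
    WelcomeService service = new WelcomeService(store);
    System.out.println(service.greetingFor("u1"));
    System.out.println(service.greetingFor("missing"));
  }
}
```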

For a more in-depth discussion on unit tests, check out Martin Fowler’s UnitTest Post.

Making [Unit] Tests Matter

As stated above, test coverage can be an indicator of code quality. But it’s only a valuable indicator if the tests that make up that coverage are meaningful and true to the expected functionality. If you’re building a new feature, then the tests should support the acceptance criteria. If you’re fixing a bug, then tests should exist to reliably evaluate that the bug no longer exists. The purpose of tests is not to check a box. The purpose of tests is to build confidence in the quality of the software.

With that in mind, here are some things to focus on that can make your unit tests more valuable and meaningful.

Focus On The Functionality

Although unit tests are granular by nature, they still test functionality. That’s what you want to focus on - isolating the functionality into different tests, which is much more important than the dogmatic isolation of classes or components. A good question to continuously ask while writing unit tests is “How does this validate that my {software unit} does what it’s supposed to do?” Another great question to continuously ask is “How do I test known exception scenarios?”
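As a sketch of that second question, the example below tests known exception scenarios for a hypothetical Account class (the class and its rules are assumptions for illustration). Each check targets one piece of functionality and also verifies that a rejected operation leaves the state untouched.

```java
class Account {
  private long balanceCents;

  Account(long openingBalanceCents) {
    this.balanceCents = openingBalanceCents;
  }

  long getBalanceCents() {
    return balanceCents;
  }

  void withdraw(long amountCents) {
    if (amountCents <= 0) {
      throw new IllegalArgumentException("amount must be positive");
    }
    if (amountCents > balanceCents) {
      throw new IllegalStateException("insufficient funds");
    }
    balanceCents -= amountCents;
  }
}

class AccountChecks {
  // Happy path: a valid withdrawal reduces the balance.
  static boolean withdrawReducesBalance() {
    Account account = new Account(500);
    account.withdraw(200);
    return account.getBalanceCents() == 300;
  }

  // Known exception scenario: overdrafts are rejected and the balance is unchanged.
  static boolean overdraftIsRejected() {
    Account account = new Account(500);
    try {
      account.withdraw(501);
      return false; // no exception means the rule was not enforced
    } catch (IllegalStateException expected) {
      return account.getBalanceCents() == 500;
    }
  }

  // Known exception scenario: non-positive amounts are rejected.
  static boolean nonPositiveAmountIsRejected() {
    Account account = new Account(500);
    try {
      account.withdraw(0);
      return false;
    } catch (IllegalArgumentException expected) {
      return account.getBalanceCents() == 500;
    }
  }

  public static void main(String[] args) {
    System.out.println("withdraw reduces balance: " + withdrawReducesBalance());
    System.out.println("overdraft rejected: " + overdraftIsRejected());
    System.out.println("non-positive amount rejected: " + nonPositiveAmountIsRejected());
  }
}
```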

Simplicity And Clarity

Unit tests are code. And like all code, simplicity and clarity are important. If you are unable to write simple tests that are self-explanatory then that’s a smell indicating the code you are testing may be overly complex.

Changing Code And Testing Go Hand-in-Hand

If you are changing code, then odds are, you should be changing tests. If you aren’t, then that could mean you didn’t previously have sufficient testing in place. If that’s the case, then there is an opportunity to add new tests!

Making Test Coverage A Code Quality Metric

From the (fictitious, but all too common) scenario above, we can see that high test coverage does not equal high code quality. If the decree is that “you must report x number” then you will get x number. You might not get better software, but you’ll get a number.

If better software is desired then better tests must be created, and a shorter feedback loop must be established. Some of that is growth, which is achieved through a culture of continuous improvement. But some of it can be prescriptive. Regular code reviews based on small integrations, an agreement to consistently improve test coverage where it is lacking, and a focus on the trend instead of an arbitrary benchmark are some examples. These are just a few things that can go a long way towards making test coverage indicative of quality instead of being just a number.
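One way to focus on the trend is to compare each build’s coverage with the previous build’s instead of with a fixed number. The sketch below is a hypothetical helper, not a feature of any particular coverage tool, and the percentages and tolerance are made up.

```java
class CoverageTrend {
  // Passes when coverage has not dropped by more than the tolerance
  // (in percentage points) since the previous build.
  static boolean isNotRegressing(double previousPercent, double currentPercent,
                                 double tolerancePoints) {
    return currentPercent >= previousPercent - tolerancePoints;
  }

  public static void main(String[] args) {
    // Well below a 75% benchmark, but improving: passes.
    System.out.println(isNotRegressing(42.0, 43.5, 0.5));
    // Above a 75% benchmark, but sliding: fails.
    System.out.println(isNotRegressing(80.0, 76.0, 0.5));
  }
}
```

A check like this rewards a legacy codebase that is steadily improving and flags a well-covered one that is starting to slip, which a fixed threshold cannot do.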