The importance of repeating tests

In the last few months I have been fighting a battle against the “old school” of software development.

By “old school” I mean the mentality that relegates testing to the end of the development cycle, that believes there is such a thing as too much testing, and that generally runs the tests once per release just so that a bunch of reports can be generated.

I don’t have to repeat just how fatal the “tests at the end” method is for a project’s success. The least that will happen is a significant release delay, followed by insufficient test coverage (because a deadline has to be met) that will lead to failures in the field, which in turn will get you angry customers.

Now, it might be OK for Microsoft to get a few angry customers. Most web applications do live with errors and so does most consumer software.

But when you work in a safety-critical area, with embedded software devices that manipulate equipment whose weight is measured in tons, the word fatal becomes relevant for more than your project’s success.

Luckily for me, we have a continuous integration system in place that runs about 90% of our tests every night. So it has been relatively easy to get hard numbers that prove the usefulness of our “excessive” testing.

One very interesting aspect was the discovery of a couple of nasty race condition errors. It started with the random failure of tests that, when run during the day, produced no errors. It was very easy to attribute them to random test rig failures (since the test rig itself initially suffered from a few problems).

Correspondingly, it was very difficult to reproduce the error, since it stubbornly refused to appear when a tester ran the failed tests again.

Now, at the time, we had a stable build (almost no changes in functionality were made in a 10-day span) and we were working on ironing out bugs.

Looking over the logs of the nightly tests for that period, after we had worked out all our known bugs, revealed a pattern to the “random” failures: a combination of factors that allowed us to write new tests that pinpointed the error (and of course verified our fix).
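To give an idea of what such a log review can look like, here is a minimal sketch in Python. It is not the tooling we actually used: the data layout (one dictionary of test name to pass/fail per night) and the test names are assumptions made for illustration. It simply flags every test that both passed and failed across a series of runs on an unchanged build.

```python
from collections import defaultdict


def find_inconsistent_tests(nightly_runs):
    """Return {test_name: (failures, runs)} for tests that both passed and failed."""
    tally = defaultdict(lambda: [0, 0])  # test name -> [failures, runs]
    for run in nightly_runs:
        for name, passed in run.items():
            tally[name][1] += 1
            if not passed:
                tally[name][0] += 1
    # Keep only tests with mixed results; always-pass and always-fail are consistent.
    return {
        name: (failures, runs)
        for name, (failures, runs) in tally.items()
        if 0 < failures < runs
    }


# Hypothetical results from four nights on an unchanged build.
nightly_runs = [
    {"test_hoist_stop": True, "test_limit_switch": True},
    {"test_hoist_stop": False, "test_limit_switch": True},
    {"test_hoist_stop": True, "test_limit_switch": True},
    {"test_hoist_stop": False, "test_limit_switch": True},
]
print(find_inconsistent_tests(nightly_runs))  # {'test_hoist_stop': (2, 4)}
```

A test that fails every night is an ordinary bug and is deliberately left out here. The ones worth a closer look are those that change their result while the code stays the same.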

Had we not repeated our tests so many times, our release would still contain those bugs. In fact, we wouldn’t even know there was a problem until it was too late and then it would be even more difficult to figure out where the problem lies.

The lesson learned from all this is that not only must all tests run successfully, but they must run successfully in a consistent way.
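As a rough illustration of what “consistent” means in practice, the following Python sketch runs one test many times and records which runs failed. The helper names are hypothetical, and the intermittent test is a deterministic stand-in for a real race condition, which would of course not fail on such a convenient schedule.

```python
def run_repeatedly(test, repetitions):
    """Call test() `repetitions` times; return the 1-based numbers of failing runs."""
    failed_runs = []
    for attempt in range(1, repetitions + 1):
        try:
            test()
        except AssertionError:
            failed_runs.append(attempt)
    return failed_runs


def make_intermittent_test(fail_every):
    """Build a stand-in test that fails on every `fail_every`-th call."""
    calls = {"count": 0}

    def test():
        calls["count"] += 1
        assert calls["count"] % fail_every != 0, "intermittent failure"

    return test


failed = run_repeatedly(make_intermittent_test(fail_every=7), repetitions=20)
print(failed)  # [7, 14]
```

Run once, this test passes and looks perfectly healthy. Only the repetition exposes the failures.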

No build can be considered stable unless its test record is consistent.