In this post I aim to provide a solid overview of what Unit Tests are and what TDD is. It is intended for people not yet familiar with TDD, definitely not for seasoned developers. I'm not presenting any fancy or advanced methods; instead, I'm focusing on explaining TDD in a way that is easy to understand. Also, if you want to learn what TDD is about but your role is different from a developer's, you are welcome to read on or watch the video. It should help you understand what it is, what its benefits are, but also how much discipline it requires to be done right.
What you need to know to get through the post
The example is implemented in Python. You should only need a very basic understanding of working with Python lists to be able to read it. When you see assertEqual in the code, it means that at this point the Unit Test framework checks whether two values are equal. In an assertEqual call, the first argument is the expected value and the second is the actual value. If the equality check fails, the whole unit test fails. As for the rest of the code, you can treat it as boilerplate.
To get the idea across, I'm using three communication channels: this post, a video of me repeating the TDD steps from the post, and Python files containing the result of the TDD cycles described here. You can use any combination of them and try them out in any order that suits you.
The TDD method
TDD requires a high level of discipline. There are certain rules, and if you break any of them, you can't claim you do TDD. Even though this is fundamental and easily available knowledge, many people don't realize what those rules are, so let's look at them:
- We progress by writing one test at a time. This test will initially fail.
- We add only the implementation necessary to satisfy that single test (cause it to pass).
- We can optionally spot and address refactoring opportunities, and then we start over from the first step.
To be more specific:
- One test at a time - yes, really only one. If you write more than one, you are doing something, but it's not TDD.
- Yes, it really means you cannot write any implementation that is not written specifically against a new, failing test. Adding implementation code because we will need it later invalidates the idea, just as writing a good deal of implementation and adding unit tests afterwards does.
- Other kinds of tests (not unit tests; for example, functional tests executed with Selenium) are outside of the TDD concept.
It is not easy to always be mindful of these rules and never bend them. Some teams write unit tests, but they do not do by-the-book TDD, and sometimes that is a conscious choice. It's important, though, to realize what TDD is and what it is not, and not to make false claims.
What tests qualify as Unit Tests? Citing Michael Feathers and his fantastic book Working Effectively with Legacy Code, we can say that Unit Tests run fast; if they do not run fast, they are not Unit Tests. More pragmatically, we must aim for the following qualities:
- our unit tests are not dependent on the execution environment
- our unit tests do not make any calls to other resources (files, web services, databases, IoT devices, etc.)
- the execution time of all of our unit tests is in the range of seconds rather than minutes (there's much more to execution time for products with huge codebases, but let's leave it at that here)
I'm not opposed to teams writing Unit Tests and yet not doing by-the-book TDD. I know it is hard, the concept does not feel natural, and some teams depend heavily on other kinds of tests, writing only a few unit tests, and even those not in the classic TDD cycle. But at least let those tests be true Unit Tests. If you bend the rules and don't do TDD, that's fine by me, but never, ever break the three qualities above, because then you defeat the very purpose of a Unit Test framework.
A working example
In order to write our very first Unit Test, we need to have something to test and we need to start with a very basic expectation: there is a list of recently opened files and at initialization the list is empty (no files).
Our list of files will have a fileList method that returns the list of recently opened files.
Test:
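The tests live in test_recently.py; here is a minimal sketch of what this first one can look like, assuming the RecentlyOpenedFiles class introduced below is importable from recently.py (the test method name is mine):

```python
import unittest

from recently import RecentlyOpenedFiles


class TestRecentlyOpenedFiles(unittest.TestCase):
    def test_list_is_empty_at_initialization(self):
        recent = RecentlyOpenedFiles()
        # expected value first, actual value second
        self.assertEqual([], recent.fileList())


if __name__ == '__main__':
    unittest.main()
```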
Implementation:
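A minimal implementation satisfying only this single test, as a sketch, can be as simple as:

```python
class RecentlyOpenedFiles:
    def __init__(self):
        # internal storage for the recently opened file paths
        self._recFiles = []

    def fileList(self):
        return self._recFiles
```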
So, we have an empty file list. What basic behavior do we expect from the list of recently opened files? We can begin with something very essential: if we add one opened file to an empty list, the list will contain only that file.
We will assume that we only need to store file paths. To let our RecentlyOpenedFiles object know that a file has been opened, we will provide it with a fileOpened method, which will be called by some application (e.g. an editor). When called, the fileOpened method is expected to add one file path to the list.
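A sketch of such a test, added as another method on the TestRecentlyOpenedFiles class from the earlier sketch (the file path is an arbitrary example):

```python
def test_opened_file_is_on_the_list(self):
    recent = RecentlyOpenedFiles()
    recent.fileOpened('/tmp/a.txt')
    self.assertEqual(['/tmp/a.txt'], recent.fileList())
```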
This test will fail until we add new logic to our implementation.
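One way to satisfy it, as a sketch without any ordering logic yet:

```python
class RecentlyOpenedFiles:
    def __init__(self):
        self._recFiles = []

    def fileList(self):
        return self._recFiles

    def fileOpened(self, path):
        # just remember the path; ordering comes in a later cycle
        self._recFiles.append(path)
```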
We move on to support more than one file in the list. Expectation: if file B is opened after file A was opened, then B is the first file in the list and A is second.
This expectation is expressed by the test below.
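A sketch, again as a new method added to the test class (paths are arbitrary examples):

```python
def test_most_recently_opened_file_is_first(self):
    recent = RecentlyOpenedFiles()
    recent.fileOpened('/tmp/a.txt')
    recent.fileOpened('/tmp/b.txt')
    # B was opened last, so it comes first
    self.assertEqual(['/tmp/b.txt', '/tmp/a.txt'], recent.fileList())
```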
This test will fail (note that the two previous unit tests still pass), because there is no logic yet to put the most recent file at the top of the list. Let's add it.
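A sketch of the changed fileOpened method:

```python
def fileOpened(self, path):
    # insert at the front so the most recent file comes first
    self._recFiles.insert(0, path)
```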
What we have not tried yet is opening a file that is already on the list. If we open file A, then B, the list will read [B, A]. Then, if we open file A again, it must change its position to become the most recently opened file: [A, B]. Expectation: if a file that is already on the recently opened files list is opened again, its position changes to the top of the list.
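A sketch of the test:

```python
def test_reopened_file_moves_to_the_top(self):
    recent = RecentlyOpenedFiles()
    recent.fileOpened('/tmp/a.txt')
    recent.fileOpened('/tmp/b.txt')
    recent.fileOpened('/tmp/a.txt')  # A is opened again
    self.assertEqual(['/tmp/a.txt', '/tmp/b.txt'], recent.fileList())
```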
The test will fail, because our current implementation simply inserts each opened file at the top of the list, so the list will not even have the correct size. Let's try to get it right.
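One way to sketch it:

```python
def fileOpened(self, path):
    # drop the path if it is already on the list...
    if path in self._recFiles:
        self._recFiles.remove(path)
    # ...then put it at the top as the most recent one
    self._recFiles.insert(0, path)
```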
We could call it a day here. But what if we want to limit the number of files in the list to, say, 5 files?
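A sketch of a test for that limit: it opens six files and expects only the five most recent ones to remain:

```python
def test_list_is_limited_to_five_files(self):
    recent = RecentlyOpenedFiles()
    for name in ['a', 'b', 'c', 'd', 'e', 'f']:
        recent.fileOpened('/tmp/' + name + '.txt')
    self.assertEqual(['/tmp/f.txt', '/tmp/e.txt', '/tmp/d.txt',
                      '/tmp/c.txt', '/tmp/b.txt'], recent.fileList())
```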
In our current implementation, there is no limit on the number of files, so the test will obviously fail (though the other four tests pass). We can get the correct behavior, with the limit included.
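A sketch of the whole class at this point, with the limit hardcoded to 5 (extracting it into a named constant would be a natural refactoring step):

```python
class RecentlyOpenedFiles:
    def __init__(self):
        self._recFiles = []

    def fileList(self):
        return self._recFiles

    def fileOpened(self, path):
        if path in self._recFiles:
            self._recFiles.remove(path)
        self._recFiles.insert(0, path)
        # trim anything beyond the five most recent entries
        del self._recFiles[5:]
```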
We can go further and check an edge case: a file that is already on the list is opened again while the list is already at its limit of 5 files.
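A sketch of the edge-case test:

```python
def test_reopening_a_file_when_the_list_is_full(self):
    recent = RecentlyOpenedFiles()
    for name in ['a', 'b', 'c', 'd', 'e']:
        recent.fileOpened('/tmp/' + name + '.txt')
    recent.fileOpened('/tmp/b.txt')  # already on the full list
    self.assertEqual(['/tmp/b.txt', '/tmp/e.txt', '/tmp/d.txt',
                      '/tmp/c.txt', '/tmp/a.txt'], recent.fileList())
```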
This test passes - it so happens that our implementation works correctly for this scenario.
Is the investment worth it?
If we stop here, we have 16 lines of implementation and 78 lines of tests (there are ways to make the tests more succinct, but that is outside the scope of this post). So someone can ask a valid question: is it worth all of that trouble? The short answer: yes, it definitely is. The long answer comes in three parts:
- as we build the logic of our implementation, the tests form a harness for us: the previously written tests will fail if we inadvertently break the existing logic. The new test will not pass as long as our implementation does not match the expectation expressed in it.
- as long as our expectations stay the same, we can reorganize the code, change class member names - in other words, refactor the code - and if any of the unit tests fails, it is a signal that we broke something. We could, for example, change _recFiles to some data structure other than a simple list (a concrete sketch follows this list). As long as our class does not change on the outside, we can do it and expect the unit tests to keep us safe.
- the time it takes to spot a mistake in the code is reduced dramatically. As we work on something using TDD, we only change or add a small delta of code at a time, so if we make a mistake, it is usually easy to catch. Later on, when we rely on a number of already created unit tests, most UT frameworks are verbose about why exactly a given unit test failed, which makes it easy to understand the problem and fix it.
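To make the refactoring point concrete, here is a hypothetical rewrite that swaps the internal list for a collections.deque. The class looks exactly the same from the outside, so all the tests above should keep passing:

```python
from collections import deque


class RecentlyOpenedFiles:
    def __init__(self):
        # a bounded deque drops the oldest entry automatically
        self._recFiles = deque(maxlen=5)

    def fileList(self):
        return list(self._recFiles)

    def fileOpened(self, path):
        if path in self._recFiles:
            self._recFiles.remove(path)
        self._recFiles.appendleft(path)
```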
Popular myths:
- Unit Testing is a phase of testing
- It really is not, and it should not be considered as such. But it's quite common to see UT pictured next to other phases, like Integration Testing, UAT and others. There are two key reasons why I believe thinking about UT as a phase is harmful: i) writing unit tests is an integral part of working on the logic implementation. This is especially visible in TDD. Once written and passing, unit tests are executed on every change, on every commit. They run fast and they are likely to be run many times a day. ii) if already existing unit tests start to fail, that's an urgent problem and it must be fixed as soon as possible. We cannot go to production with any product that does not have 100% of its unit tests passing.
- Unit Tests should be written by someone else than the person who implements the logic
- No, they are an integral part of the development process, done by the same person who implements the logic. In the case of pair programming, this is still done by both people in the pair. Usually the effort requires frequent updates and back-and-forth switching between product code and unit tests: the code, then the tests, then the code, then the tests...
- Unit Tests add large, unnecessary cost
- They definitely add cost, just like any other kind of testing. The problem is, this cost is particularly easy to notice, because unit tests can be characterized in terms of their number, their execution time or their lines of code. It is much more difficult to quantify the effort spent trying to find a mistake in code written without unit tests. It is much more difficult to gauge the effort spent investigating bugs injected into existing functionality and discovered a few weeks later. It is much more costly to find a simple mistake through exploratory testing. Nobody is able to gauge these efforts and sum them up. But everybody can see my 78 lines of unit tests in this example.
Reference:
The files produced during the exercise:
- logic implementation: recently.py
- unit tests: test_recently.py
The video of me repeating the exercise live:
Picture used:
- France in XXI Century - School - public domain