If you’ve read my previous articles on TDD (here and here), you’d know I have complicated feelings about it.
If you’ve read my review of a critically acclaimed novel, you’d know I have major trust issues when it comes to book recommendations.
And if you’ve read my first ever article, you’d know that big names can still have terrible ideas.
TDD isn’t a well-defined practice in the industry. Whenever someone claims their team does TDD, it can mean any of the following:
- We write automated tests (in no particular order)
- We write all our tests upfront, then implement them
- We follow TDD exactly as described in Kent Beck’s book (red-green-refactor)
- We do TDD, but only for bug fixes and/or small features.
This makes it hard to have discussions about it because most people aren’t talking about the same thing to begin with.
I don’t care that much about this phenomenon. Words change over time. If TDD evolves to mean a practice that the industry has developed on its own (without perfectly adhering to the theory), then that’s what it is. When it comes to who’s using the correct terminology, it’s majority rules.
Instead, I want to talk about something very specific: the source of it all, Kent Beck’s 2003 book, Test-Driven Development: By Example. It’s a step-by-step guide on how to apply TDD, using actual examples and not just theory. Here’s an excerpt from the introduction:
This book follows two TDD projects from start to finish, illustrating techniques programmers can use to easily and dramatically increase the quality of their work. The examples are followed by references to the featured TDD patterns and refactorings. With its emphasis on agile methods and fast development strategies, Test-Driven Development is sure to inspire readers to embrace these under-utilized but powerful techniques.
Beck’s goal with this book isn’t just to demonstrate TDD, but to inspire people to use it by showing how much better it is. Here were my expectations going in:
- A thorough and coherent explanation of the process
- Theory that is well-supported and stands up to scrutiny
- Examples that illustrate its strengths over traditional development
- An honest discussion of its drawbacks
- Well-presented and communicated material
I think that’s reasonable, don’t you?
The Theory
Kent Beck’s TDD process works as follows (paraphrasing):
- Jot down expected behavior in a list
- Pick a behavior and write a test for it in code with your imagined perfect interface (public methods), and watch it fail because you haven’t implemented it.
- Write just enough production code to make that one test pass
- Refactor your code to remove duplication
- Add more behaviors in your list as you discover more about the problem through implementation.
- Repeat.
This is called the “Red-Green-Refactor” loop, and it’s the backbone of the whole philosophy.
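Here’s a minimal sketch of one trip through that loop, written after the fact so both halves are visible at once. The `WordCounter` class and its `count` method are hypothetical names invented for illustration (they aren’t from the book), and plain `AssertionError` checks stand in for a test framework so the snippet is self-contained.

```java
// Red: these checks are written first, against an interface we wish existed.
// Until WordCounter.count is declared, this class does not even compile.
class WordCounterTest {
    static void countsWordsSeparatedBySpaces() {
        check(WordCounter.count("red green refactor") == 3, "three words");
    }

    // A behavior discovered during implementation and added to the list later.
    static void blankInputHasNoWords() {
        check(WordCounter.count("   ") == 0, "blank input");
    }

    static void check(boolean condition, String label) {
        if (!condition) {
            throw new AssertionError("failed: " + label);
        }
    }

    static void runAll() {
        countsWordsSeparatedBySpaces();
        blankInputHasNoWords();
    }
}

// Green, then refactor: the first passing version could simply return 3.
// What is shown here is the state after the second test forced a real
// implementation and the duplication was cleaned up.
class WordCounter {
    static int count(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return 0;
        }
        return trimmed.split("\\s+").length;
    }
}
```

Calling `WordCounterTest.runAll()` runs both checks and throws if either fails. The snippet can only show the end state; the process itself is the back-and-forth between the two classes.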
Even though tests play a large role in the process, the goal isn’t actually the tests themselves, but the design (though the tests are a bonus). The purported benefits are:
- Better interfaces
- Coupling reduction between components
- Reduction in defects
- Increased test coverage
- Higher quality tests that don’t assume implementation details
His evidence is…anecdotal. Though programming studies are notorious for their methodological flaws, and he admits as much, so I won’t fault him for that.
It’s really hard to tell which design improvements he purports are exclusive to test-first, as opposed to testing at all, because they can all be practiced independently of TDD, and they often are.
But that’s not all, because the supposed benefits are also psychological:
1. It gives you immediate feedback on your interface design early on
2. It gives programmers confidence when they refactor their code (thanks to a test suite that catches regressions as they happen).
3. It breaks down each change into small, manageable steps. (Red means your test isn’t a false positive, Green lets you focus only on passing the test, and Refactor lets you focus only on improving implementation)
4. It increases trust within and between teams (because of the extra emphasis on testing)
5. It increases motivation during development (because of seeing the test bars go from red to green and reducing monotony)
Let’s break each of these down and show why they’re either incorrect, misleading, or don’t require TDD.
(1) This is probably the most commonly cited benefit of TDD.
The idea is that, without TDD, developers are far more likely to dig themselves a hole with an interface that’s hard to properly test, likely because of logic that’s difficult to isolate. Then when it comes time to test, they encounter difficulties, and rather than improving their code, they slack on testing and push their code through.
This is also why Beck says tests should be written one at a time. Because if you write all your tests at once, then later on have to make major changes to the interface you’re testing, all those tests have to be modified, and the same resistance to change rears its head.
Where do I start?
This all sounds good in theory, but it’s just that: theory.
First of all, there’s the assumption that testing your interface early will expose design flaws more quickly. But in order for this to be effective, the developer has to already know how to write good tests. Otherwise, how would they know their design is hard to test?
But if they already know that, then they very likely have a good intuition on how to write their code such that it’s easy to test. In which case, TDD is just a bottleneck.
Second, interface design is decided by the consumers, not you, the developer/tester.
For APIs and GUIs, that would be the needs of internal or external customers.
And for lower-level components, that’s decided by the implementations of the higher-level components.
I think he believes that by writing tests one at a time, refactors will be smaller because the developer does it bit-by-bit with each test, rather than making sweeping changes at once.
This is idealistic. I wish it was like this. But any expert developer will tell you that even a single line of code can open up a whole can of worms, requiring a major refactoring.
If you have a suite of tests you’ve accumulated with TDD, then discover something during implementation that requires a major refactor, or your requirements change (which happens often), then guess what? Beck misses this entirely: YOU’LL STILL HAVE TO REWRITE YOUR TESTS ANYWAY.
(2) This isn’t exclusive to TDD. Tests written later can give the same confidence during a refactor. And I’m not even just talking about tests written after the whole implementation is complete (test-first vs test-last is a false dichotomy). Some people write tests after certain implementation milestones are reached to defend against regressions.
(3) Also not exclusive to TDD. Beck himself has a TODO list separate from his test suite. Manual testing as a measure of progress is just as valid. Then tests can be written later to handle edge cases once the interface has crystallized.
(4) Once again, not exclusive to TDD. High-quality tests don’t require TDD. If TDD is adopted voluntarily by a team, they likely also deeply care about test quality to begin with. It’s self-selection, not a consequence of TDD.
If management tries to enforce TDD, they also have to adopt something like pair programming so that everyone’s work is watched by someone else. I’d hardly call that “trust.” But even so, forcing people to use TDD will just result in half-baked tests. After all, if you have to write test code before production code when you don’t want to/don’t know how to, you’ll write fewer or worse tests.
(5) This is so subjective and Beck provides no evidence. Many would find it irritating and distracting to constantly switch between production and test code with such granular changes in between.
If you’re gonna claim (or even strongly imply) that TDD is superior without hard evidence, while not leaving any room for whether TDD’s success is situational or dependent on the developer, I’ll call you out. The only drawback Beck cites for TDD is that it won’t help you if you don’t care about writing good code. Gimme a break!
Many developers are able to comfortably work in a traditional development workflow. If you claim TDD is superior without qualification, then you must believe that it hasn’t caught on simply because of fear, laziness, or misunderstanding. Unless you have strong evidence, that’s an extremely arrogant stance to take (which tracks).
The Examples
Enough about the theory. Let’s look at how Beck applies TDD to actual problems. There are two examples he uses in the book, but I want to focus on the first one because it’s the most concrete and the easiest to explain.
| Instrument | Shares | Price | Total |
|---|---|---|---|
| IBM | 1000 | 25 USD | 25000 USD |
| Novartis | 400 | 150 CHF | 60000 CHF |
| | | Total | 65000 USD |

| From | To | Rate |
|---|---|---|
| CHF | USD | 1.5 |
Beck presents a hypothetical bond portfolio system in USD. A new requirement has come in: the system needs to support multi-currency portfolios, with an exchange rate and a total in a desired currency.
Beck starts with this test, and demonstrates TDD from there using Java:
This is a test for multiplying a dollar amount by a scalar value to get a new dollar amount. This is meant to handle getting the total value of a specific bond (price per share * # shares).
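As a rough sketch of what a test like that can look like (not the book’s actual listing): the `Dollar` class, its `amount` field, and the `times` method below are assumed names, and the minimal `Dollar` implementation is only there so the snippet compiles. Under TDD, the test would exist first and fail until that class was written. The numbers come from the IBM row in the table above.

```java
// Hypothetical minimal Dollar, included only so the test below compiles.
class Dollar {
    final int amount;

    Dollar(int amount) {
        this.amount = amount;
    }

    // Returns a new Dollar instead of changing this one.
    Dollar times(int multiplier) {
        return new Dollar(amount * multiplier);
    }
}

class DollarTest {
    // Price per share times number of shares gives the value of the holding.
    static void testMultiplication() {
        Dollar pricePerShare = new Dollar(25);
        Dollar holdingValue = pricePerShare.times(1000);

        if (holdingValue.amount != 25000) {
            throw new AssertionError("expected 25000 but got " + holdingValue.amount);
        }
        // The original amount is untouched by the multiplication.
        if (pricePerShare.amount != 25) {
            throw new AssertionError("price per share was mutated");
        }
    }
}
```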
I’m not gonna walk through every single TDD step he does, because that would take ages. He continually revises the objects and adds more tests as the solution evolves, so critiquing an early draft wouldn’t make sense. I’m just showing this for context.
But already, something’s wrong. Why are we starting our tests from the lowest-level objects? This Dollar object doesn’t exist, which implies that either:
- This is a greenfield system (not just a new requirement in an existing system)
- There’s some other object in the system that holds dollars (like a Money object)
- That all money operations in the system until now have been operating with raw Java numbers (with the assumption that it’s all in USD)
This is already confusing because we, the readers, have no idea how the higher-level systems expect money to be represented or to behave at a code-level. Beck doesn’t explain this at all, so I can only infer from the scant details he’s described and the implementation itself (which is completely backwards). Presentation is just as important as information.
You can read the 16-chapter-long TDD process yourself. But I can tell you that it doesn’t make TDD look good at all. The number of revisions Beck makes is staggering. If anyone actually developed like this, I’d worry.
Beck also doesn’t provide a final, authoritative version of the code in the book at the end of the section, so rather than piece it together from each step, I found it online. Once again, presentation could use improvement.
Here’s a link to the final code. Take a look and keep in mind the following:
- This repo uses modern Java, but the underlying meaning is the same.
- The repo organizes classes into folders, but the book does no such thing.
- testArrayEquals, testFrancMultiplication & testDifferentClassEquality are older tests that were created then deleted later.
- This book was written in 2003, before constructs like Enums were introduced to Java.
- Money is supposed to be an abstract class, not a concrete one.
Update: Here’s an even better repo for the code. It removes remnants of the previous TDD steps. Also, the Money class was actually refactored to be concrete, with the Dollar and Franc classes being removed in Chapter 10. Apologies for not making this clear.
In the spirit of TDD, let’s look at the tests first. He has tests for the following scenarios:
- Money equality and inequality, with same and different currencies
- Scalar Money multiplication
- Same-currency and different-currency addition
- Sum with another sum
- Sum with times
Good stuff.
But as I looked at the code more closely, I saw so many problems.
- What if bank.reduce is called with currencies that aren’t in its hashtable? The Bank class doesn’t handle this scenario.
- Are negative amounts allowed? There’s no guard for this in the code.
- Why is money represented by integers? What about cents?
These aren’t just nitpicks. The fact that these scenarios weren’t considered pokes holes in the whole concept, since TDD is presented as the solution to bad testing.
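To make those gaps concrete, here’s a sketch of what handling them could look like. Everything in it is hypothetical and not from the book: `Cash` and `RateTable` are made-up names, amounts are stored in minor units (cents) so fractions of a dollar survive, and rejecting negative amounts is just one possible policy (whether negatives are valid is a requirements question the book never raises).

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.HashMap;
import java.util.Map;

// Hypothetical money value stored in minor units, e.g. cents.
final class Cash {
    final long minorUnits;
    final String currency;

    Cash(long minorUnits, String currency) {
        // Assumed policy for this sketch: negative amounts are rejected.
        if (minorUnits < 0) {
            throw new IllegalArgumentException("negative amount: " + minorUnits);
        }
        this.minorUnits = minorUnits;
        this.currency = currency;
    }
}

// Hypothetical rate lookup that fails loudly on an unknown currency pair.
final class RateTable {
    private final Map<String, BigDecimal> rates = new HashMap<>();

    void addRate(String from, String to, BigDecimal rate) {
        rates.put(from + "->" + to, rate);
    }

    // Converts by dividing by the stored rate, rounding to whole minor units.
    Cash convert(Cash source, String to) {
        if (source.currency.equals(to)) {
            return source;
        }
        BigDecimal rate = rates.get(source.currency + "->" + to);
        if (rate == null) {
            throw new IllegalArgumentException(
                "no rate from " + source.currency + " to " + to);
        }
        long converted = BigDecimal.valueOf(source.minorUnits)
            .divide(rate, 0, RoundingMode.HALF_EVEN)
            .longValueExact();
        return new Cash(converted, to);
    }
}
```

With a CHF to USD rate of 1.5, converting 60000.00 CHF (6,000,000 minor units) yields 4,000,000 minor units of USD, while asking for a currency pair that was never registered throws instead of failing somewhere further down the line.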
But here’s the big one. Why are we using INHERITANCE to represent summations of money? Are you kidding me? I thought this was a joke when I first saw it.
Beck’s inspiration for this came from math, like how “(2 + 3) * 5” is evaluated as:
(2 + 3) * 5
= 5 * 5
= 25
where the operations in the parentheses are evaluated first.
You have a list of shares, the value of each share, the number of each share, and a bank with conversion rates. That’s all we know, so just do this:
No need for a separate object. Just do the work and only complicate when requirements demand it.
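A minimal sketch of that kind of function, under some assumptions: `Money`, `Bank`, and `Portfolio` here are simplified, hypothetical stand-ins (not the book’s classes), rates are stored as doubles keyed by a "FROM->TO" string, and conversion divides by the rate the way the rate table above implies (CHF to USD at 1.5).

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified, hypothetical Money: an amount plus a currency code.
class Money {
    final long amount;
    final String currency;

    Money(long amount, String currency) {
        this.amount = amount;
        this.currency = currency;
    }

    Money times(int multiplier) {
        return new Money(amount * multiplier, currency);
    }
}

// Simplified, hypothetical Bank: holds rates and does the conversion itself.
class Bank {
    private final Map<String, Double> rates = new HashMap<>();

    void addRate(String from, String to, double rate) {
        rates.put(from + "->" + to, rate);
    }

    Money convert(Money source, String to) {
        if (source.currency.equals(to)) {
            return source;
        }
        Double rate = rates.get(source.currency + "->" + to);
        if (rate == null) {
            throw new IllegalArgumentException(
                "no rate from " + source.currency + " to " + to);
        }
        return new Money(Math.round(source.amount / rate), to);
    }
}

class Portfolio {
    // Converts every amount to the target currency and adds them up.
    static Money total(List<Money> amounts, Bank bank, String to) {
        long sum = 0;
        for (Money money : amounts) {
            sum += bank.convert(money, to).amount;
        }
        return new Money(sum, to);
    }
}
```

Fed the two holdings from the table (25 USD times 1000 and 150 CHF times 400) and the 1.5 rate, `Portfolio.total` returns 65000 USD, the same figure as the Total row.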
Beck has arbitrarily decided to bundle sums of money into an object because…it’s elegant like math? My most charitable guess is that he wanted to bundle the amounts into an expression so that he could delay evaluation until he decided which currency to reduce to.
He doesn’t say this. I’m giving him the benefit of the doubt and reading between the lines.
But this doesn’t accomplish anything that my solution above can’t. If you can pass around a Sum object, you can pass around a List<Money> too. There’s no need to encode summation in an object. A function does the trick just fine.
It’s not more performant either, because Sum.reduce is recursively defined, so Money.reduce has to be called for each Money in the sum anyway, just like my function.
It’s interesting that there isn’t also a Prod expression for scalar multiplications. If you’re gonna represent math, don’t stop at summations (that’s a joke btw).
It’s because of this attempt at elegance that Bank.reduce has a clunky execution flow.
Take a simple conversion like bank.reduce(Money.dollar(5), "CHF"):
- bank.reduce(Money.dollar(5), "CHF") ->
- return Money.dollar(5).reduce(this, "CHF") ->
- int rate = bank.rate("USD", "CHF"); ->
  return new Money(5 / rate, "CHF");
- if ("USD".equals("CHF")) return 1;
  Integer rate = (Integer) rates.get(new Pair("USD", "CHF"));
  return rate.intValue();
Notice how we hop into the Bank class, then it PASSES ITSELF into an Expression object (Money in this case), then we hop into the Money object, then BACK to the Bank class to get the rate? What’s going on here!?
I’ll tell you what’s going on. Beck didn’t want the money-related objects to be responsible for their own conversions, probably because of some rigid idea of the object-oriented responsibilities of each class.
But Expression.reduce already takes a Bank object, so why the unnecessary pass-through?
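For anyone who wants to step through that flow, here’s a condensed sketch of the design as this article describes it. It’s reconstructed from the trace above and is not a verbatim copy of the book’s code: the string-keyed map is an assumption standing in for the Pair-keyed hashtable, and anything not mentioned in the article (constructors, `addRate`, `plus`) is my guess at the minimum needed to make it run.

```java
import java.util.HashMap;
import java.util.Map;

// Anything that can be reduced to a single Money in a target currency.
interface Expression {
    Money reduce(Bank bank, String to);
}

class Money implements Expression {
    final int amount;
    final String currency;

    Money(int amount, String currency) {
        this.amount = amount;
        this.currency = currency;
    }

    static Money dollar(int amount) { return new Money(amount, "USD"); }
    static Money franc(int amount) { return new Money(amount, "CHF"); }

    // Adding does no arithmetic yet; it just builds a Sum to reduce later.
    Expression plus(Expression addend) {
        return new Sum(this, addend);
    }

    // Hop 2: Money asks the Bank it was handed for the rate.
    public Money reduce(Bank bank, String to) {
        int rate = bank.rate(currency, to);
        return new Money(amount / rate, to);
    }
}

class Sum implements Expression {
    final Expression augend;
    final Expression addend;

    Sum(Expression augend, Expression addend) {
        this.augend = augend;
        this.addend = addend;
    }

    // Recursive: each side is reduced first, then the amounts are added.
    public Money reduce(Bank bank, String to) {
        int amount = augend.reduce(bank, to).amount
            + addend.reduce(bank, to).amount;
        return new Money(amount, to);
    }
}

class Bank {
    private final Map<String, Integer> rates = new HashMap<>();

    void addRate(String from, String to, int rate) {
        rates.put(from + "->" + to, rate);
    }

    // Hop 1: the Bank passes itself into the expression.
    Money reduce(Expression source, String to) {
        return source.reduce(this, to);
    }

    // Hop 3: back in the Bank. A missing pair makes get() return null,
    // so unboxing throws a NullPointerException (the unhandled case).
    int rate(String from, String to) {
        if (from.equals(to)) {
            return 1;
        }
        return rates.get(from + "->" + to);
    }
}
```

With a CHF to USD rate of 2 registered, reducing `Money.dollar(5).plus(Money.franc(10))` to "USD" gives 10 USD, and it takes the Bank, Sum, Money, Bank round trip for each operand to get there.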
Phew. That’s enough.
I know there’s gonna be that guy who’s like, “B-b-but it’s just a toy example. The point isn’t the correctness or elegance, it’s just to demonstrate TDD”.
If you’re trying to showcase the strengths of TDD and claim it will increase the productivity and quality of your work, but your process is really cumbersome, badly presented, and results in clunky design, what am I supposed to think as a reader?
There’s another severe flaw that applies to both examples, and it’s something that every zealot does to pontificate about their paradigm’s strengths while ignoring weaknesses.
Any paradigm can look good when applied to silly, simple examples. But the true measure of any process is how adaptable it is. What about DB calls, third-party APIs, file operations, GUI, other side effects, etc.? Surely these were relevant concepts even in 2003, so why’s this book considered the ultimate guide to TDD?
There are other relatively minor flaws with the book, but I think I’ve said enough. Even if you like this specific practice of TDD, surely you can admit this book leaves a lot to be desired.
God, I’m tired. But the work is done, and at least I learned the word, “augend,” today.
This wouldn’t be a fair review if I didn’t also talk about the positives.
First, I respect Beck for sticking to his guns and presenting TDD in all its glory; both the beautiful and ugly parts. Many advocates will shy away from discussing the parts that feel like a slog to get through, but Beck is disciplined, and I don’t believe he’s a grifter trying to profit off industry buzzwords.
I also want to commend Beck for something we take for granted, but is impressive nonetheless. When performing operations on an object’s attributes (like Money.times earlier), he favors returning a new resulting object over mutating the original, because he understands the dangers of side effects.
I like that he switches between Java and Python to demonstrate that TDD principles are language-agnostic.
There’s some good advice in Part III about how to set up testing in certain tricky scenarios, and general refactoring advice. I’d recommend people read that, if nothing else.
I think it takes courage to pour so much time into a piece of work you care deeply about, and share it with the public in hopes of influencing them, even if that opens you up to criticism. I’m sure Beck’s experiences, and those of many others, have made them feel that TDD was the best solution for their problems, and that’s valid.
I don’t know how much Beck’s beliefs have changed since publishing, and honestly, I don’t care that much. This isn’t personal at all. I like to criticize works on their own merits.
Anyway, thanks for reading my longest article so far, and have a nice day.

