by jodosha on 4/30/14, 12:03 PM with 107 comments
by Nursie on 4/30/14, 1:01 PM
Agile was supposed to ease up on process and make teams adapt to changing requirements. It wasn't supposed to use up >30% of your working time just to service the methodology, but that's what it ends up doing when you get in the Agile Evangelists.
TDD was supposed to ensure more correct software at the cost of some overhead (perhaps 30%?) by making sure every unit had its tests written ahead of the code. In practice I've seen it kill productivity entirely as people write test harnesses, dummy systems and frameworks galore, and never produce anything.
A combination of these two approaches recently cost an entire team (30+ people) their jobs, as they produced almost nothing for almost a year despite being busy and ostensibly working hard all year. We kept one guy to deal with some of the stuff they left behind and do some new development. When asked for an estimate for a trivial change he gave a massive timescale and then explained that 'in the gateway team we like to write extensive tests before the code'.
The only response we had for him was 'and do you see the rest of the gateway team here now?'
by yanowitz on 4/30/14, 1:08 PM
The specifics of this debate are kind of uninteresting because of the (general) lack of nuance from the various sides, even if each is informed by its own lived experience.
OTOH, the recurrent reality of <insert topic> debate in our industry is very interesting.
I think it's some combination of:
* a bunch of problems are still unsolved
* software is so powerful that sub-optimal solutions are usually Good Enough
* industry amnesia, driven by developer/engineer turnover
* the relative infancy of the industry, especially as a function of the rate of change (I'm not sure how you would normalize for rate-of-change, social structure and communication speed, but it would be interesting to compare these debates to medieval guilds in Europe).
* ???
To take up the first two above:
Things are better than they used to be -- as late as the 90s, code reuse was still an unsolved problem. Of course, code quality is still hard--we are reusing broken code, but at least we "only" have to fix it once.
I think it's hard to overestimate the importance of Good Enough as a factor in these recurring debates. Everyone can be right from the business's point of view -- tons of money is still being saved. Once you get past the initial ramp of a company, structuring for continuing team velocity while making headway in your chosen market(s) seems like a different optimization problem than what got you there (again, not a new topic!).
Just some partially formed thoughts...
by grandalf on 4/30/14, 1:47 PM
Many Rails apps are tightly coupled, and many unit tests written by Rails developers exercise 10% program logic and 90% framework features.
Of course this is going to be slow. We can argue about hacks to make it faster but at a certain point it's a problem whose solutions start to distract us from solving the important problems.
If you have a web app and get to write a single test to determine whether it's safe to deploy, that test would be an integration test.
The decision to write tests more granular than integration tests is a decision to be made based on assumptions about the rate of change of components of the system.
TDD is tangential to the above observation.
There are many cases where an implementation is easy to figure out (though possibly time consuming) while the optimal interface design is less obvious. TDD can be really useful to quickly iterate interfaces and verify that all the moving parts work as expected together, before worrying about the implementation details... This makes it possible to work on a larger system with more focus on problem solving, fewer mistakes, and less overall cognitive load.
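A rough sketch, in Python, of what that interface-first loop can look like (the ReportBuilder name and its methods are purely hypothetical, and the test is deliberately written against a stub that doesn't work yet):

    import unittest

    # Hypothetical example: pin down the interface of a report builder
    # before committing to how the report is actually assembled.
    class ReportBuilder:
        def add_section(self, title, body):
            raise NotImplementedError  # implementation deferred

        def render(self):
            raise NotImplementedError  # implementation deferred

    class ReportBuilderInterfaceTest(unittest.TestCase):
        def test_sections_appear_in_render_output(self):
            builder = ReportBuilder()
            builder.add_section("Summary", "All systems nominal.")
            self.assertIn("Summary", builder.render())

    if __name__ == "__main__":
        unittest.main()

The test fails until the implementation exists, but the shape of the interface has already been exercised and can still be changed cheaply.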
by programminggeek on 4/30/14, 2:06 PM
My idea is to create tests that take screenshots and diff them over time. You could then set a change-percentage threshold that, when exceeded, signals a test "failure". Your QA team could then run through that process and see that something significant changed. Maybe that is fine, but maybe it is hugely unintended.
With a Time Machine of screenshots of different processes, you could compare changes easily and see if they are worth further investigation. For example, this would be useful if you change some CSS or JS for just one page and it ends up breaking another page.
The key point of this system is not that it would tell you when your system is broken, but rather that a significant change occurred that might have broken something. It's not a substitute for human analysis or thought.
Is anyone doing something like this and would it be useful to anyone else?
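For what it's worth, a rough sketch of the diff-and-threshold step in Python, assuming the Pillow imaging library and made-up file paths:

    from PIL import Image, ImageChops

    def changed_fraction(before_path, after_path):
        """Fraction of pixels that differ between two same-sized screenshots."""
        before = Image.open(before_path).convert("RGB")
        after = Image.open(after_path).convert("RGB")
        diff = ImageChops.difference(before, after).convert("L")
        pixels = list(diff.getdata())
        changed = sum(1 for p in pixels if p > 0)
        return changed / len(pixels)

    # Flag for human review rather than hard-failing: a large visual change
    # may be intentional, but it deserves a look.
    THRESHOLD = 0.02  # 2% of pixels; tune per page
    if changed_fraction("baseline/home.png", "current/home.png") > THRESHOLD:
        print("Significant visual change on the home page -- needs review")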
by lnanek2 on 4/30/14, 1:03 PM
by jshen on 4/30/14, 3:01 PM
And Bob is guilty of that religiosity himself. See here https://www.youtube.com/watch?v=WpkDN78P884&feature=youtu.be...
Jump to the 58 minute mark if it doesn't automatically.
by planetjones on 4/30/14, 1:12 PM
Maybe I can see what he's trying to say, but I don't think the statement alone is accurate.
For the GUI, i.e. at the human boundary of the system, the most value (especially to stop regression and catch side effects) is often added with tests, e.g. automated tests which perform some user function in the GUI and assert the results.
Another physical boundary of the system is a database. Writing tests which cross this boundary adds a lot of value too.
I'd favour these tests which hit the boundaries and go over them, over a codebase with only unit tests and endless mocking any day of the week.
Also, these types of tests can be written first. We do it.
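As a rough illustration (in Python, with a made-up users table and helpers), a boundary-crossing test can run real SQL against a real, if in-memory, database instead of mocking the driver:

    import sqlite3
    import unittest

    # Hypothetical persistence helpers under test.
    def save_user(conn, name):
        conn.execute("INSERT INTO users (name) VALUES (?)", (name,))

    def find_user(conn, name):
        row = conn.execute(
            "SELECT name FROM users WHERE name = ?", (name,)
        ).fetchone()
        return row[0] if row else None

    class UserPersistenceTest(unittest.TestCase):
        def setUp(self):
            self.conn = sqlite3.connect(":memory:")
            self.conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

        def test_saved_user_can_be_found(self):
            save_user(self.conn, "alice")
            self.assertEqual(find_user(self.conn, "alice"), "alice")

    if __name__ == "__main__":
        unittest.main()

A test like this can be written before the helpers exist, exactly as with any other test-first workflow.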
by nsfyn55 on 4/30/14, 2:59 PM
by nimblegorilla on 4/30/14, 3:47 PM
It seems Uncle Bob implies that it's ok to skip TDD if you think it is hard to test something. There are much better reasons for most apps to avoid testing their CSS. Likewise there are many reasons for some projects to have extensive automated testing around CSS even if it might be hard.
by asgard1024 on 4/30/14, 3:36 PM
It has always seemed to me that if you were to write perfect, automated tests that cover 100% of your application, you would basically have reimplemented it. (Or in other words - if you want to check whether your calculations are correct, you have to do the calculations again.) Ideal, fully automated tests are basically taking two implementations, running them side by side, and comparing the results.
That's why I am not a big fan of tests, in the sense that there is too much focus on them in the SW industry, and they seem like a hammer (useful but overused).
I think there should be more focus on writing the _one_ implementation correctly. This can be done with better abstractions (e.g. actor model for concurrency, functional programming, ..) and asserts (programming by contract), and maybe even automated SW proving. I don't think these techniques are as popular as testing, but I wish they were more popular, because they let you write programs only once and correctly.
[Update: To specifically expand on the point about asserts: if you can trivially convert a test to an assert, why not do it? Unfortunately tooling doesn't support asserts as much as it supports tests.]
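A small Python sketch of the test-to-assert conversion (the function and its numbers are hypothetical): the contract is stated inside the function, so it is checked on every call rather than only for the inputs a test author happened to pick.

    def apply_discount(price, rate):
        assert price >= 0, "precondition: price must be non-negative"
        assert 0 <= rate <= 1, "precondition: rate must be a fraction"
        discounted = price * (1 - rate)
        assert 0 <= discounted <= price, "postcondition: discount cannot raise the price"
        return discounted

    # The equivalent external test would assert the same postcondition,
    # but only for the handful of inputs written into the test.
    print(apply_discount(100.0, 0.2))  # 80.0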
by williamcotton on 4/30/14, 3:02 PM
Initially the tester will view each screenshot of an application state that is being tested and set that as "passing". Next an automated test runs and the latest screenshots are compared to the passing screenshots.
If they are different then the test fails. A manual tester then needs to take a look at the tests that failed and decide if the test actually failed or if the changes were supposed to be there.
If the changes were supposed to be there the tester can make this image the new passing screenshot. Passing screenshots should probably be reset BEFORE the tests are run. I see no reason why not to just check these images in to the repo along with all of the other test conditions.
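A minimal sketch of that compare-and-approve loop in Python, assuming Pillow and a made-up directory layout that lives in the repo:

    import os
    import shutil
    from PIL import Image, ImageChops

    BASELINE_DIR = "screenshots/passing"   # hypothetical: approved baselines, checked in
    CURRENT_DIR = "screenshots/latest"     # hypothetical: output of the latest test run

    def images_match(a_path, b_path):
        a = Image.open(a_path).convert("RGB")
        b = Image.open(b_path).convert("RGB")
        return a.size == b.size and ImageChops.difference(a, b).getbbox() is None

    def review(name):
        """Fail the check when the latest screenshot differs from the approved one."""
        baseline = os.path.join(BASELINE_DIR, name)
        current = os.path.join(CURRENT_DIR, name)
        if not os.path.exists(baseline) or not images_match(baseline, current):
            print(name + ": differs from the approved baseline, needs a human decision")
            return False
        return True

    def approve(name):
        """The tester decided the change was intended: promote it to the new baseline."""
        shutil.copy(os.path.join(CURRENT_DIR, name), os.path.join(BASELINE_DIR, name))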
I've been scheming on ways to do video diffs for testing transitions and animations although I'm not sure if this provides much value. It would be mostly an academic pursuit.
by mbrock on 4/30/14, 1:50 PM
One involves questions of process, enforcing TDD, and whether or not TDD can save a bad team from producing bad stuff. The other is the question of what TDD can offer for a skilled team with an intelligent approach to development.
Mixing these aspects leads people to dismiss TDD because they've seen teams fail by doing TDD in a bad way.
Another question: is there a way to structure software so that questions of boundaries and collaborators become less troublesome for testing? I think a promising road is in value-oriented programming without side effects.
Another way to see it: if you need a lot of tedious mocking to test your unit, maybe the unit should be redesigned to have fewer collaborators, or maybe you should move the complex logic into a pure function, and so on. Maybe TDD difficulties are showing us that there is something wrong with how we write code. After all, that's what it's supposed to do.
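A small Python sketch of that split (all names are hypothetical): the interesting rule lives in a pure function that needs no mocks, while a thin impure shell keeps the side effects.

    def late_fee(days_overdue, daily_rate, cap):
        """Pure: same inputs always give the same output, so it's trivially testable."""
        return min(days_overdue * daily_rate, cap)

    def charge_late_fee(account, repository, today):
        """Impure shell: fetch state, apply the pure rule, persist the result."""
        loan = repository.find_open_loan(account)  # hypothetical repository API
        fee = late_fee((today - loan.due_date).days, daily_rate=0.50, cap=10.00)
        repository.record_fee(account, fee)

    # The behaviour worth testing is in late_fee and can be exercised directly:
    assert late_fee(3, 0.50, 10.00) == 1.50
    assert late_fee(100, 0.50, 10.00) == 10.00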
by lclarkmichalek on 4/30/14, 1:02 PM
by dllthomas on 4/30/14, 3:15 PM
This seems dead wrong. There is probably no way to write tests first in this environment, but with so many different browsers interpreting your CSS (to run with the preceding example), you need to be aware of when changes in your code cause changes in rendering that might need to be revalidated and fiddled with further! I do agree that it doesn't fit well with TDD, but it absolutely can work with automated testing.
by DanielBMarkham on 4/30/14, 1:59 PM
"...software controls machines that physically interact with the world..."
See, that's not always true. I would love it if all software interacted with the outside world. But a lot of software doesn't interact -- just take a look at some of that code sitting in your repository sometime. Some of that isn't deployed, isn't being used. You could test that until the cows come home and have a whole bucket full of nothing.
Because the Bobster and the other TDD guys are correct: you gotta test to know that the code is doing what it's supposed to. Testing has to come first. In a way, the test is actually more important than the code. If you get the tests right, and the code passes them, the code itself really doesn't matter.
Where we fall down is when we confuse the situation of a commercial software development team working on WhipSnapper 2.0 with a startup team working on SnapWhipper 0.1. The commercial guys? They are working on a piece of code with established value, with a funding agent in place, with a future of many years (hopefully) in production. Everything they create will be touched and used over a long period of time. The startup guys? They've got a 1-in-10 shot that they're alive next year. Any energy they put into solving a problem that hasn't been economically validated is 90% likely to be wasted.
Tests are important, but only when you're testing the right thing. The test for the startup guys is a business test, not a code test. Is this business doing something useful? If so, then just about any kind of way of accomplishing that -- perhaps without any programming at all -- provides business value.
That's a powerful lesson for startup junkies to assimilate. In the startup world, you don't get rewarded based on the correctness or craftsmanship of your code. You're looking at one or two weeds instead of realizing the entire yard needs work.
Put a different way, we have Markham's Law: The cost of Technical Debt can never exceed the economic value of the software to begin with.
</standard TDD comment>
by robmcm on 4/30/14, 1:12 PM
I can see its benefits if you have simpler I/O for your code.
by pornel on 4/30/14, 2:43 PM
by harel on 4/30/14, 2:12 PM
Does it matter if "TDD says this or says that"? Aren't these methodologies more like 'suggestions' for us to adopt as they fit our needs, while trimming the stuff that doesn't? Once you adhere to a methodology religiously you lose the flexibility and pragmatism that the methodology intended to give you. It just becomes systematic, dogmatic following of a rule book, like any religion.
by dllthomas on 4/30/14, 3:06 PM
Or screenshots, of course, which is still relying on code, obviously, but probably not your code. Of course, defining what you're looking for in a screenshot is going to be nontrivial unless you're doing a simple check that it hasn't changed from the last manually-approved version or something.
by platz on 4/30/14, 2:19 PM
by tempodox on 4/30/14, 2:58 PM
by dllthomas on 4/30/14, 3:04 PM
Doesn't accounting?
by pjmlp on 4/30/14, 1:15 PM
A good post for the TDD evangelists I have met so far.
by raverbashing on 4/30/14, 1:02 PM
"How can I test that the right stuff is drawn on the screen? Either I set up a camera and write code that can interpret what the camera sees, or I look at the screen while running manual tests."
I have nothing else to add