by karlerss on 2/24/21, 8:51 AM with 37 comments
by skrebbel on 2/25/21, 4:17 PM
For those who do this (post author or anyone else here), how do you deal with low-impact bugs?
As a concrete example, we're building a chat toolkit. One customer observed that in some versions of Firefox, when combining a particular set of features in our product, the scroll position wouldn't be remembered. This was 100% a bug. It's also an edge case of an edge case that likely only happened for this one customer, and even there, had a relatively small impact on UX for a small subset of their users. It was essentially a browser bug, and fixing it would require a big workaround that made one component of our product significantly more complex (and thus more prone to other bugs).
With a zero bug policy, we'd have to fix that before shipping anything else. But it made no business sense to do so, very much in the same sense that building a niche feature used by a tiny % of customers tends to make no sense.
But once you let that one fly, there's no zero-bug policy left, right? You can just declare any bug as "not important enough right now" and -poof!- zero bugs! Yay, time to ship features.
For context, I'm talking about a small, tight-knit team comparable to the author's.
by WesolyKubeczek on 2/25/21, 3:59 PM
A) Prolonged discussions about what the exact and precise definition of the word "bug" is, and how whatever was released last night and caused mayhem between clients and support was not it
B) Bargaining so my pet stuff is released right after the holidays and I don't look like the unproductive schmuck
C) Using this to nitpick at and get rid of employees someone doesn't like
D) Stack ranking the employees by the number of bugs they let slip past them
E) A full-on war between developers and the QA department
F) Fear of making any progress at all because a bug might creep in
...and of course, anything else you might have seen in your favorite Kafka books, in "Brazil", in "1984", in Lem's "Memoirs Found in a Bathtub", you name it. Of course, in the metrics it's going to look as if the company exceeds any expectations in implementing its zero bug tolerance policy! The managers will work hard on the infographics to show you.
by Alex3917 on 2/25/21, 4:08 PM
IMHO in the long run this saves a lot of time and money. Even bugs with zero user impact can signal some deep misunderstanding about technology, and fixing the problem immediately before it gets replicated everywhere else in the codebase is hugely valuable. Several times there have been cases where there was an extremely inconsequential issue that led to us discovering and fixing all sorts of important bugs that we hadn't even known about.
by tantalor on 2/25/21, 3:57 PM
by mawise on 2/25/21, 6:25 PM
by LegitGandalf on 2/25/21, 6:20 PM
If you accept that Chaos destroys Value (and it surely does), then it is a no-brainer to adopt workflows that find and kill Chaos.
One value-add pattern that is really helpful for finding Chaos is using software health metrics to find the echoes of Chaos. Much like how we find black holes by looking for gravitational lensing, Chaos can be found by looking at metrics like software response times under Representative Load. Inconsistent response times are an indication of unhealthy contention in the solution (things waiting on other things that are waiting on other things, with some thing pausing intermittently). Becoming slower over time is also an indication of poor health.
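A rough Python sketch of that kind of health check, assuming response times have already been collected under load in chronological order. The function name, the thresholds, and the choice of coefficient of variation as the "inconsistency" measure are illustrative assumptions, not part of the Value, Filler & Chaos model:

```python
from statistics import mean, pstdev


def response_time_health(samples_ms, cv_threshold=0.5, slowdown_threshold=1.2):
    """Flag inconsistent or degrading response times.

    samples_ms: response times in milliseconds, oldest first.
    Both thresholds are hypothetical defaults; tune them per system.
    """
    if len(samples_ms) < 2:
        raise ValueError("need at least two samples")

    avg = mean(samples_ms)
    # Coefficient of variation: spread of the samples relative to their mean.
    cv = pstdev(samples_ms) / avg

    # Compare the newer half of the samples against the older half.
    half = len(samples_ms) // 2
    slowdown = mean(samples_ms[half:]) / mean(samples_ms[:half])

    return {
        "cv": cv,
        "slowdown": slowdown,
        "inconsistent": cv > cv_threshold,
        "getting_slower": slowdown > slowdown_threshold,
    }


if __name__ == "__main__":
    spiky = [100, 900, 100, 100, 950, 100]
    print(response_time_health(spiky))
```

A high coefficient of variation only says that something is pausing intermittently; finding what is contending still takes profiling.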
Some other useful insights from the Value, Filler & Chaos model are:
• Teams run at 20% value or less. This really has to do with the nature of discovering new, valuable software embodiments. Discovery of new things requires many value attempts, most of which fail, but result in new learning
• Removing unused features is a win because you reduce Filler and sources of Chaos
• Mobile apps taken as a whole run about 1% value (positive revenue), the rest is all Chaos and Filler
• To know if something is Value vs Filler there has to be Traction. Chaos also destroys Traction. The article is a classic case of the team recognizing that Chaos was destroying Traction
by ufmace on 2/25/21, 7:28 PM
It's very possible that this particular team could stand to put a higher priority on fixing bugs before implementing new features. That's ultimately going to be what it is, no matter what they call it. They are free to call it "Zero Bug Tolerance" as long as everybody understands that it's hyperbole and they don't get into endless bikeshedding on what constitutes a bug and if they really should fix it.
It's pretty obvious there will eventually be a bug that's too rare and weird to really troubleshoot, or too niche and complex to bother fixing, or more trouble than it's worth.
by r0s on 2/25/21, 5:20 PM
I came up with this system of coverage which would be a huge improvement and a much tighter testing process, eventually moving "left" up the development pipeline:
I proposed this and a bunch of other ideas, and the general reaction was flat. My boss said he didn't understand any of it and cut me off trying to explain.
I realize now, the goal set by the CTO was just talk, they had no interest in any real process change. And so, nothing changed.
The concepts are sound. Granted, they could be better explained; I'm working on it, just not being paid to do so.
by closeparen on 2/25/21, 5:36 PM
by S_A_P on 2/25/21, 6:02 PM
by benibela on 2/25/21, 6:52 PM
But it gets hard when the users do not cooperate and there is nothing to reproduce.
In an open-source app I only got 3 bug reports in the bug tracker in nearly 15 years. And they were not really bugs either: one question and two HTTPS problems. I hope it is because I have tested every change for months and have thousands of automated tests. Or it is because the users do not find the bug tracker.
I do get a lot of mails. They are all useless. The most common is, "There is an error message 'Invalid password'". Then I reply that the message appears when they enter an invalid password. Then they do not respond. And then I do not know if there is a bug or whether they have entered a wrong password. Then I also test it for a few hours and see that it sends exactly the password that was entered to the server.
Another project, an HTTP client. Bug report: "untrusted https certificate" on someone else's server. I try it on my system, and it works fine. Then I ask for their OpenSSL version, and do not get an answer. Now what can I do about this? I try it on multiple computers, and it works on all of them.
Another open-source project, much more popular. Bug report: it crashes frequently. Because it is much more popular and has competent users, I do not have to do anything about it; the users investigate it themselves. Two months later, the user has extracted the crashing code. They remove as much code as possible, until they obtain a minimal crashing program. It shares zero code with my project. All that remains are calls to an open-source library, and they report it upstream to the developers of the library. I guess there is nothing to do until they fix it there?
On that project I also get emails. They are also useless, because the competent users use the bug tracker. After moving to 64-bit, I get a lot of "it does not start anymore". I guess they use a 32-bit OS.
by juancn on 2/26/21, 4:32 PM
What you have to do is quickly triage all bugs, and then make a call on when you're going to address each one.
Some you fix immediately, some you postpone to a scheduled release, some you never fix, just document them.
Before a release, you set a bar on what bugs are acceptable for release, but you make a call on them, involving quality, product and engineering teams.
A company has finite resources, so you need to invest them wisely.
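The triage flow described above could be sketched roughly like this in Python. The 1 to 5 `impact` and `fix_cost` scales, the cut-offs inside `triage`, and the names `Bug`, `Disposition`, and `release_blockers` are all hypothetical; in practice the call is made by people from quality, product, and engineering, not by a formula:

```python
from dataclasses import dataclass
from enum import Enum


class Disposition(Enum):
    FIX_NOW = "fix immediately"
    SCHEDULE = "postpone to a scheduled release"
    DOCUMENT = "never fix, just document"


@dataclass
class Bug:
    title: str
    impact: int    # hypothetical scale: 1 (cosmetic) to 5 (outage or data loss)
    fix_cost: int  # hypothetical scale: 1 (trivial) to 5 (large, risky workaround)


def triage(bug):
    """Make a quick first call on when a bug gets addressed (illustrative rule)."""
    if bug.impact >= 4:
        return Disposition.FIX_NOW
    if bug.impact >= bug.fix_cost:
        return Disposition.SCHEDULE
    return Disposition.DOCUMENT


def release_blockers(open_bugs, max_acceptable_impact):
    """Open bugs above the bar that was set for this release."""
    return [b for b in open_bugs if b.impact > max_acceptable_impact]


if __name__ == "__main__":
    bugs = [
        Bug("checkout returns 500", impact=5, fix_cost=3),
        Bug("typo in settings page", impact=2, fix_cost=1),
        Bug("scroll position lost in one browser version", impact=1, fix_cost=5),
    ]
    for bug in bugs:
        print(f"{bug.title}: {triage(bug).value}")
    print([b.title for b in release_blockers(bugs, max_acceptable_impact=3)])
```

The point of encoding the bar as a parameter is that it is set per release, so the same open bug can block one release and be acceptable for another.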
by guenthert on 2/25/21, 11:36 PM
Time to market is still a thing and incompatible with overzealous bug fixing. I would settle on a zero-regression policy. Then at least you keep happy customers happy and potentially end up with fewer bugs in the future.
Of course it depends on the application. An automotive ABS system has different requirements than an IRC server.
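One way a zero-regression check might be expressed, as a minimal Python sketch: treat a regression as any test that passed on the previous release and no longer passes on the candidate. The dictionary shape (test name mapped to a pass/fail boolean) and the function name `find_regressions` are assumptions made for illustration:

```python
def find_regressions(previous, current):
    """Return names of tests that passed before and do not pass now.

    previous, current: dicts mapping test name -> True if the test passed.
    A test that has disappeared from `current` also counts as a regression.
    Tests that were already failing are ignored; those are known bugs.
    """
    return sorted(
        name
        for name, passed in previous.items()
        if passed and current.get(name) is not True
    )


if __name__ == "__main__":
    previous = {"login": True, "scroll_restore": False, "export": True}
    current = {"login": True, "scroll_restore": False, "export": False, "new_feature": False}
    regressions = find_regressions(previous, current)
    print(regressions)
    # A release gate under this policy would refuse to ship if the list is non-empty.
```

Note that this only guards behavior that already has a test, so what it protects depends on how much of the product the existing suite covers.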
by pwinnski on 2/25/21, 8:45 PM
"In retrospect, maybe the strategy to reach approximate feature parity real fast was not the optimal one."
Or maybe that strategy was, and usually is, the only way to attract paying customers.
How to balance bug fixes with new feature development is always a trade-off. It's never as simple as "Zero Bug Tolerance." Unless, I suppose, you're writing software for a spaceship or deep-sea vehicle.
by jjjeii3 on 2/25/21, 9:10 PM
by collyw on 2/25/21, 4:40 PM
Zero code is pretty much the only way you can guarantee zero bugs.