from Hacker News

We invested 10% to pay back tech debt

by hanifbbz on 1/15/23, 10:33 PM with 222 comments

by kace91 on 1/16/23, 12:04 AM
I’ve seen “x% time for tech debt” rules in several companies and sadly they didn’t work too well.
Since the problem was the culture of continually pushing half-baked features in the first place, the rule was quickly corrupted: people would design a good system, throw anything that’s not required for a POC into the tech debt backlog and deliver a barely functioning version.
“This is a technical debt task” was used to keep anything that wasn’t a new feature from taking time out of the other 90% of the sprint.
Basically, if you assign a block of time to quality, you risk people taking that as an excuse to not focus on quality outside that block.
by echohack5 on 1/16/23, 12:28 AM
Having been in this space for 25 years now, here's my take:
Tech Debt is almost always a few things in disguise:
1. Product Failure - The person with the final say on the product has poor taste. They poorly understand the tradeoffs their tech stack gives and how it influences their product. And most importantly they hate their customers and don't understand their needs.
2. Marketing and Sales dominance - The product organization might be competent but they have been driven out of the decision making venues in the company. So marketers and sales make the decisions and hand them down to product to deal with, resulting in tone-deaf decisions. Even as the company continues to make good money, the product itself erodes, until competitors arrive and instantly wipe out the business.
3. Success and shift - The product, which was initially a success, has stagnated. The product itself was so successful that it created an entire new market with competitors that resulted in commodifying the product. Now the company is frantically looking for a pivot to keep growing the business, resulting in the original successful product drying up even faster.
4. Leadership void - The product was made by a strong, pioneering leader. They might not even consider themselves such a leader, but after they move on, the product fails without their support. The replacement leader might not even be bad at running the business for a time, but eventually focusing on EBITA alone won't inspire people, and the product will erode through churn.
5. Press Release driven development - The company operates by making a moat around its original core offering and they have a semi-monopoly in their space. So the only way to drive more revenue is to sell services that cost more to already existing customers. As a result, teams build products that are made to drive hype cycles and press releases -- once the product is shipped, the major players get promotions and move on to the next exciting product instead of supporting the now-shipped product.
by hooande on 1/16/23, 12:34 AM
> "You see, the leadership did not care about the code quality as long as the stories were delivered on time."
My problem with the concept of technical debt is that "code quality" is subjective, and more often than not it translates to "I don't like the structure of this code because I didn't write it". What matters to the business is that the code works. The owners of the business do not care if the developers enjoy working with the code or if they find it to be well written.
I hear all the time that bad code slows down development, increases bugs, etc. And that can be true in a number of cases. But most of the time it's just complaining. If a developer takes the time to re-factor the code base, it's very likely that whoever takes over from them in a few years will have the same complaints about technical debt and want to re-factor all over again.
The best way to ensure everyone's job and increase compensation is to deliver features that customers really want. Doesn't matter what the code looks like, as long as the customers are happy. Re-factoring is important to developers and to maintaining an organization. But it can wait until there aren't pressing feature demands.
by r3trohack3r on 1/16/23, 1:26 AM
I really wish we would stop calling it technical debt. Every team/org I’ve worked in with “tech debt” issues has had very tactical problems that could have been communicated, invested in, and solved. But instead the org talked about “tech debt” - an immeasurable boogeyman that anyone outside of the engineering org has no grasp of and, most importantly, management up the chain to the C-Suite have decreasing mental models of investment/pay-off the further up you get.
Teams saying “tech debt” are perpetually under funded and under appreciated.
Instead, speak a language your management chain understands.
* These specific services have outgrown their architecture and back pressure keeps outgrowing their current scale, we need to invest in a more reactive architecture. It’s going to cost 3 teams 1Q and we will prevent N outages based on historical data.
* In 2023, engineers fat-fingered the deployment of these services N times causing various levels of service outages, one made the news, we need to invest in guardrails in our CI/CD to prevent that. It’ll cost one team 2Q and we will prevent N outages.
* We had 4 employees across our engineering org quit last quarter because holding the pager burned them out, we need to stand up a tiger team that can help kick our metrics into shape.
Speak a language your management understands. Speak in terms of delivering features (feature velocity), reliability (outages), employee retention, hiring through resume driven development, etc.
You’ll find you’re negotiating in a positive sum game if you do this. You give me 1 unit of investment for this problem this quarter and I’ll give you 1.3 units of return next quarter. And maybe there are greater returns elsewhere so you aren’t making a competitive bid and that is okay, or maybe your management will invest in you and you just signed yourself up to deliver 1.3 units. But don’t handwave and ask for budget.
by andrewstuart on 1/15/23, 11:48 PM
I might be doing it wrong but I constantly pay down technical debt to almost zero. The reason I do this is that when I am working on an area of the system it takes a lot of thinking to understand it, so I try to get as much done as possible in that context, because re-learning the context later is very expensive in terms of time.
I'm pretty certain that means new features are implemented a lot more easily. Hard to quantify though.
I can say I'm always thanking "yesterday me" for finishing the job and cleaning up whatever "today me" is working on.
It's possible for me to do this because I do most of my programming on my own. I don't think this approach is suitable for most software engineering processes because they tend to deprioritise technical debt.
by llamaLord on 1/16/23, 12:25 AM
As a PM, can I just say that your PM is fucking useless!
Your PM is meant to be commercially minded, meaning they should understand the concept of compound interest.
Every new feature you build, on average, increases the value of your product linearly.
Every major piece of tech debt incurs reduced efficiency that compounds. It is extremely common in older systems for the incremental value of resolving tech-debt to be HIGHER than the incremental value of shipping "another feature".
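To put rough numbers on that, here is a minimal Python sketch. The figures are made up for illustration (10 units of feature value per sprint, a 3% per-sprint efficiency drag from unresolved debt); they are assumptions, not measurements from the article or from any real team.

```python
# Illustrative only: feature_value and drag are assumed numbers, not measurements.

def delivered_value(sprints: int, feature_value: float, drag: float) -> float:
    """Total value shipped when each sprint's capacity shrinks by `drag` relative to the previous one."""
    total = 0.0
    capacity = 1.0  # fraction of a full sprint's output the team can still deliver
    for _ in range(sprints):
        total += feature_value * capacity  # features add value linearly
        capacity *= 1.0 - drag             # unresolved debt compounds sprint over sprint
    return total


if __name__ == "__main__":
    no_drag = delivered_value(sprints=26, feature_value=10.0, drag=0.0)
    with_drag = delivered_value(sprints=26, feature_value=10.0, drag=0.03)
    print(f"no drag: {no_drag:.1f}")
    print(f"3% drag: {with_drag:.1f}")
```

Under these assumed numbers, a year of two-week sprints with a 3% compounding drag ships roughly 30% less value than the same year with no drag, which is the kind of gap that can make a debt fix worth more than one more feature.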
by goto11 on 1/16/23, 7:19 AM
The only solution I have seen work: Paying back tech debt should be an integrated part of regular development rather than a separate task. When you change or add a new feature in an area of the code, you clean up tech debt in the area, at least to the level where it is better than it was before.
This also avoids "moving around deck-chairs" refactoring, since the refactoring is coupled to specific development tasks. You refactor to make implementing the task easier and cleaner - no more, no less.
If some area of code is "ugly" but works and doesn't need any functional changes, leave it be.
Scheduling tech-debt payment separately runs the risk of getting de-prioritized. If a deadline is approaching or the company needs to cut expenses, I'm sure the dedicated "tech-debt-payback" time is the first to get cut, "temporarily".
by xyzelement on 1/16/23, 12:42 AM
The "X% for work engineers want to do" approach always strikes me as a poor substitute for communication and sober evaluation of the value of work.
If I am a PM, and my team's velocity is badly bogged down by some tech debt, then the right allocation to fixing it is 100% (fix the shit so we can go fast forever.)
On the other hand, if some "tech debt" doesn't actually impact team velocity/clients (eg, some code is "bad" but it's in a part of the system that's never touched) the right allocation is 0%.
There's ultimately only one thing that matters - getting value out to the customers. Tech debt only matters to the extent that it gets in the way of that, so prioritizing it vs features is easy because at the end it's still about "what do the clients get, when".
The X% approach seems to happen when engineering and product fail to have that conversation, fail to understand each other, so they have to just get a flat allocation each.
Suboptimal.
by cratermoon on 1/15/23, 10:54 PM
I've seen it said that in any ongoing work on a large and complex software product, about 20% of the time is spent on upkeep just to keep the system from degrading further. Probably your team was already spending a portion of their time shoring up the worst parts. That's typical, and why the build up of problems slows down feature work.
I always plan on spending some percent of my time on work I know I'll have to do before I can start adding the new features. Regardless of how I'm asked to estimate, I keep in mind the cleanup work in accounting for the total effort.
Unfortunately there are still a lot of PMs and folks, including some programmers, who aren't aware of or don't understand the need for structural maintenance. Those are the teams I regularly see start out going very fast but within a few months get bogged down in having to work in the awful system they built.
by nisa on 1/16/23, 12:02 AM
Good article and a good inspiration to debate this topic. Honestly, I already get in a bad mood when I read this conversation with the PM, this mentality of "We need to timebox that activity so it does not swallow the time that should go to features and bug fixes." Depending on the tech debt and the state of the codebase, devs waste enormous amounts of time working around stupid issues (i.e. tech debt), and even a little bit of team effort might help. Maybe there is some company out there where someone argues for rewriting almost bug-free Haskell code to use higher-kinded types and lenses because at the moment it doesn't look so good. But I guess in reality there is often just a lot of shitty cruft in Java/JavaScript land, where a few weeks of thinking and fixing (in the team, with lots of discussion) improves much more, and frees up more time for features, than avoiding it does.
by ramesh31 on 1/16/23, 12:24 AM
Maybe I'm an unorganized mess. But there's no such thing as "x%" of my time. There's just time. And I'm either focused on doing something or I'm not. Management loves to think that they can parcel out time in that way. But it's a complete fantasy.
by hbrn on 1/16/23, 6:10 AM
Tech Debt Friday worked for this team, good for them. It will probably not work for another team.
Tech debt is a never-ending discussion. But ask yourself: why does almost every team struggle to figure out how much time to dedicate to tech debt? Why is there constant tension between engineering and management?
The answer is simple: engineers and business people don't understand each other. We live in different worlds and speak different languages. You can't solve this problem with a methodology. If Tech Debt Friday or Google's 20% works for you, that's just luck (or wishful thinking).
Once you understand where the problem is coming from, the solution is also simple: find someone who speaks both languages and trust them to decide. Typically that person is a product engineer: it's easier to explain business to an engineer than engineering to a business person.
And when I say trust, I mean both engineering and management should do it. I.e. if that person says a couple of outages is not a big deal, it's not a big deal. If that person says we need to spend next week refactoring, then that's what you spend your next week on. Obviously you can still challenge those decisions, but you have to accept that this person is an expert in their domain. They don't know everything, but they know more than you. Even if you're the CEO. Even if you're coming from FAANG with 20 years of experience.
That's why companies where the founder is a product engineer typically don't have a tech debt problem (they still have tech debt, they just don't have a problem with it).
I know it sucks to not have a mathematical or managerial solution to tech debt, but tech debt is inherently complex and human. Tech debt happens when humans are solving problems for other humans. The only solution to it is to have another human in between. The quality of that solution will depend on the quality of that human. It's not going to be perfect. But it's the best you can do.
All other approaches are just shots in the dark.
by hashmap on 1/16/23, 12:39 AM
How often do products go under because of unresolved tech debt? Is it many? When I first got into software I spent a lot of time on refactoring and cleaning up code. As time goes on I find myself doing that less and less, and just focusing on work that produces meaningful results. Fixing tech debt, from what I can tell, is less meaningful than conventional wisdom says it is.
No software will be perfect. Eventually it will die and something will replace it. I think knowing what "good enough" is, is perhaps the more important capability. Larger fruit hangs lower than tech debt the vast majority of the time.
by romanhn on 1/16/23, 5:53 AM
This was a failure of product management and, doubly so, engineering management. A PM working in software absolutely needs to understand the concept of technical debt. Not necessarily to know where the debt is on their own, but to work closely with engineers on understanding and managing it along with product functionality. That said, I reserve most of my scorn for the eng manager. This is the person whose duty it is to say that yes, in fact 20% is the right proportion for tech debt work after it's been neglected for so long. This is the person to stand up to leadership and explain why this is necessary.
Autonomous teams with bottom-up decision making are much more likely to push towards the right thing even with clueless management, whereas there is little hope when incompetent decisions are fed to the team in a top-down manner (yes, autonomous teams indeed exist - though they are typically a sign of competent management).
by theptip on 1/16/23, 12:42 AM
I like this approach, and have found it useful.
One other trick I used is to have a “tech debt week” at the end of every quarter: 12 weeks of coding per quarter, then a week for the managers/PMs to evaluate the last Q and plan the next one. At the same time, while there is this awkward 1wk window where the plan is in flux, the engineering teams can focus on polishing their tools and attacking tech debt that might take more than a day every sprint to make progress on.
Of course, this planning window is probably too short for big companies, and too much planning overhead for one-team startups. But for a 20-30 person engineering org this cadence worked well.
by tanin on 1/16/23, 5:07 AM
Sometimes I feel I'm the only person who thinks tech debt is not that difficult to solve.
Everything needed to be solved is under your control. It may take time and be boring, but you have everything you need to solve it.
Product-market fit, customer acquisition, etc. are often much more urgent/difficult to solve, and we should focus on those first.
What I've seen in a lot of teams is tech debt (or speculative tech debt if we move quickly) is exaggerated to the point that they cannot launch quickly to acquire and iterate with customers, which is a huge mistake in product development.
by GuB-42 on 1/16/23, 12:14 AM
Before clicking on the link, I expected failure. Turned out it is a success story, and I am a bit disappointed.
I have seen "code cleanup" that made things worse countless times. Especially if the idea is to dedicate time just for that, instead of doing it as you go. I expected the article to be about one of these failures.
The article describes 7 points, and suspiciously nothing goes wrong. It is rarely how it happens, there is typically some trial and error involved, and knowing what failed is as important if not more so than the final solution that works for you.
by gerdesj on 1/16/23, 1:18 AM
Tech debt is simply another debt that a company or org needs to deal with. It's far cooler for a tech company to worry about their funky no vowels thing than say the rent.
My boring little IT company in the UK owns its premises - roughly an acre in a town in Somerset with a 19m long edge red brick two floor building. The property cost us about £240,000 - the mortgage is cheaper than the rent we paid on part of a converted stables really out in the Styx.
Now with a mortgage there is the fine print. In the UK it is normal for a bank to require a "debenture". If your Loan to Value is below a percentage then the bank can intervene at any point and take over, which is what at least one bank in the UK did to try and shore up their finances when it all went south in 2007ish.
There is another thing called an "Overage Clause". That's where the vendor wants to make a hold over future profits. So we bought our place from the NHS and they sought an overage in perpetuity (for ever) on any profits we might make on selling the place. We negotiated on 10 years. That expires this year. With it, we would have to hand over 50% of any profits from sale.
Running a company is not rocket science but it can be tricky. My little firm will never be a unicorn or even a mare with a cornet on its nose. I don't care.
by kitanata on 1/16/23, 5:31 AM
10% is under investing and this is a bad methodology. How do you know what to work on? Is tech debt just the stuff that annoys you or is it actually bad?
What you need to tackle tech debt is a quality program with metrics. I’m talking code coverage, cyclomatic complexity checking, linters and scanners, DORA, SAST, DAST, etc. Quantify your quality. Then quantify the risks and costs of not improving it.
Then you need to target the areas of code your tools tell you to and you need to make a conscious effort to solve those very specific things. “Module A’s complexity score is 26. Our standards say this needs to be 10. Therefore this is considered a quality item. Therefore it goes into the sprint as a strategic investment.”
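As a rough illustration of that kind of gate, here is a minimal Python sketch that flags functions whose complexity score exceeds a limit. It uses only the standard `ast` module and a simplified approximation of cyclomatic complexity (1 plus the number of decision points); a dedicated tool would count more cases. The function names and the set of node types counted are assumptions made for this example, and the limit of 10 is the one quoted above.

```python
import ast
from typing import Dict

# Node types counted as one extra decision point each. This is a simplified
# approximation of cyclomatic complexity, not what a dedicated tool reports.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)

MAX_COMPLEXITY = 10  # the standard quoted in the comment above


def complexity(func: ast.AST) -> int:
    """Approximate cyclomatic complexity: 1 + number of decision points."""
    score = 1
    for node in ast.walk(func):
        if isinstance(node, BRANCH_NODES):
            score += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` has three operands, so it adds two extra paths
            score += len(node.values) - 1
    return score


def over_budget(source: str, limit: int = MAX_COMPLEXITY) -> Dict[str, int]:
    """Return {function name: score} for every function scoring above `limit`."""
    tree = ast.parse(source)
    return {
        node.name: complexity(node)
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        and complexity(node) > limit
    }
```

Running `over_budget(open("module_a.py").read())` in CI (the file name is hypothetical) would give a concrete list of functions to put into the sprint, rather than a general feeling that the module is "bad".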
Software and business leaders, when developers talk about tech debt… they are talking about managing complexity. (Shit breaking all the time because you don’t have tests in a complex system. It’s failing because of the complexity.) High complexity is expensive. If you do not balance the need to manage complexity against features and you do not act intentionally about your quality your software will eventually fail or if you’re lucky it will just reach a stage where you can’t maintain it anymore and you’ll scrap and rebuild.
Investing in quality makes you go zoom.
by floatinglotus on 1/16/23, 12:19 AM
You’ll never end up with any technical debt if you don’t ship.
by gumby on 1/16/23, 12:17 AM
IMHO this was the best part of the (interesting) post:
> Having dealt with tech debt in a collaborative manner, enabled us to do the “regular work” faster because we had a better collective understanding of the code, and the code was cleaner to work with. One could argue this is just a positive effect of mob programming, but the lack of a concrete agenda also helped the autonomy that unlocked creativity.
by zmmmmm on 1/16/23, 12:19 AM
The types of tech debt described don't really match well to the real problem ones I see. These are major decisions deferred / delayed that never got rectified and the cost has risen exponentially to address them. To get them done requires a total stop-the-world type effort with simultaneous migrations across multiple teams / services / products.
by carl_sandland on 1/16/23, 1:15 AM
There's no such thing as 'technical debt', just call it crap code (or a 'hack') and face the impacts of having to live with it. It's been my experience that these things get logged to Jira as 'medium' or 'low' and then of course never get looked at again, and if they ever get opened up it's likely the code has moved on. This leads to the following idea:
1. code the MVP the customer accepts (customer is happy)
2. go ahead and create all the debt tickets (makes you feel professional)
3. every xmas just delete all non-high tickets older than 12 months.
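The yearly purge in step 3 can be sketched in a few lines of Python. The `Ticket` record and its fields are hypothetical stand-ins, not any tracker's real API, and "12 months" is approximated as 365 days.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class Ticket:
    # Hypothetical record; a real tracker would expose its own fields.
    key: str
    priority: str  # "high", "medium", or "low"
    created: date


def purge_stale(tickets: List[Ticket], today: date, max_age_days: int = 365) -> List[Ticket]:
    """Keep high-priority tickets and anything created within `max_age_days`; drop the rest."""
    cutoff = today - timedelta(days=max_age_days)
    return [t for t in tickets if t.priority == "high" or t.created >= cutoff]
```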
If the code is shit enough, it will die a natural death (most of today's wiring type code has very short TTL anyways) Similar to "if a tree falls in the forest...", "If a customer doesn't notice...". We do so many 'invisible', 'hard' things in our line of work, it's almost an impossibly thankless road to build quality into our systems.
by rubyist5eva on 1/16/23, 12:59 AM
You don’t have dedicated time for technical debt. You make fixing technical debt part of your estimates for new features.
by glintik on 1/16/23, 8:36 AM
Why not 15%? Or maybe 5% will be enough. Actually, it’s a bad idea to plan time in such a manner. If devs need to, they should be able to refactor/fix things during the normal development process without any “just 10%”.
by fuzzieozzie on 1/16/23, 1:28 AM
I have communicated tech debt very effectively as cleaning a house - "If you do not regularly clean up you're going to end up in a world of hurt!"
Warning: your house might end up looking like a hoarder's house!
by llsf on 1/16/23, 12:25 AM
I had started the "Documentation afternoon" on Fridays in my team. And everyone seemed to like it, to wind down the week with uninterrupted time for documentation (email catch up, expenses, etc.)
by namuol on 1/16/23, 1:53 AM
Dedicated budgets for tech debt only serve to justify decisions to cut corners in the first place. It’s not a prioritization issue or a matter of smarter sequencing. It’s a cultural problem.
by Scubabear68 on 1/16/23, 12:33 AM
I would argue that if they were able to pay down the tech debt that quickly and easily, then most likely the debt wasn’t as bad as described to begin with.
by DangitBobby on 1/16/23, 12:08 AM
This environment sounds stifling. Very glad most of the time there's no one trying to inject tickets between me and fixing broken shit.
by SideburnsOfDoom on 1/16/23, 9:01 AM
> PM argued: So, you are telling me that we must spend 20% of our time just to keep the lights on?
Yes. If the "keeping the lights on" work has been neglected for long enough then yes, in fact 20% is a lowball number.
The context is not so much "keeping the software running" but "reducing the friction in further changes to, and deployments of that software"
by vitriol83 on 1/16/23, 11:14 AM
In places where there is not much time for code refactoring, the following is helpful:
Imagine an idealised future state of the codebase, which everyone buys into, and make sure any new feature is going in that direction.
Refactoring existing code can be death by a thousand cuts; having a parallel new codebase which is incrementally adopted can be more efficient and quicker to market.
by hendry on 1/16/23, 2:43 AM
People talk about tackling tech debt, though I want to see results in the form of reduced/stable lines of code!
The author mentions 180k of code... but did the team actually chip away at that SLOC thanks to "Tech debt Friday"?
https://github.com/kaihendry/graphsloc is how I track projects.
by pictur on 1/16/23, 6:08 AM
I think the concept of technical debt is a kind of corporate culture. it can be repaid over time, but it's likely to reappear unless you reduce the things that caused it. that is, of course, it cannot be completely eliminated. but for example, things like serially redundant feature development can be reduced.
by bob1029 on 1/16/23, 1:10 AM
I think using financial terminology to describe technology problems is a mistake.
There is no such thing as tech "debt". Or, if there is, it's entirely subjective.
I've seen things described as a "6+ month major rewrite" that another developer could address with careful, incremental enhancements over a matter of weeks.
by revskill on 1/16/23, 3:01 AM
The only reason I raised issues about tech debt at previous companies is that I don't want to inherit a messy codebase created by my predecessors.
Even with just a good test suite, you can go very far. A clean test (interface) is more important than clean code (implementation).
by andrewfong on 1/16/23, 7:31 AM
The interesting aspect IMHO is not that they set aside 10% of their time to fix tech debt per se, but that they set aside 10% of their time to collaborate on it / engage in mob programming.
by vb-8448 on 1/16/23, 12:19 AM
> Decreasing MTBF (mean time between failure): the software fails more often and there are increasingly more incidents.
Is this actually true? In my experience, in the long term the failure rate tends to zero
by snissn on 1/16/23, 12:40 AM
This strikes me as very smart -
The first rule is not to create tech debt in the first place. The PR (Pull Request) that creates tech debt should come paired with the issue to deal with it.
by hanifbbz on 1/15/23, 10:33 PM
Story from a job I had a couple of years ago. At the time I took it for granted but I haven’t seen something quite like it before or after.
by logicallee on 1/16/23, 12:54 AM
Hey I have a question for everyone here. Is it possible to go camping without being obligated to build a house at the same site?
by asmyers1793 on 1/21/23, 2:05 AM
I'm starting to check out grit.io for technical debt help
by adityaathalye on 1/16/23, 8:10 AM
Simple debt is a poor metaphor. Jenga tower is a better one. An even better one is "Jenga tower of subprime debt" [1]. Xkcd nails it as usual: https://xkcd.com/2347/
The author writes:
> "To their credit, I came in when the code was like a crumbling Jenga tower"
Structurally bad systems look a lot like that. Small misalignments and nonlinearities compound to make the structure vulnerable to a sneeze.
Relatedly, I feel it is not so much what % was applied, but what targets were de-risked. Things that tend to matter include mean time to recovery, time to market from _planning_ to deployment (not just commit to deployment), defect rate (e.g. hotfixes and rollbacks). If the superstructure is good, one can start with cheap panel walls and gradually swap them out for nicer ones with better properties.
[1] cf. a post I wrote to ruminate that angle: https://www.evalapply.org/posts/software-debt/index.html#mai...
edit: add source
by ochronus on 1/16/23, 5:54 AM
On a related note, I've written about how to tackle tech debt - dedicated capacity/time is just one approach. https://leadership.garden/tips-on-prioritizing-tech-debt/
by trevorLane on 1/16/23, 4:51 AM
teams should push for a 70/30 feature to tech-debt ratio. product is so good at feeding the dev machine, devs rarely get time to evaluate the landscape and push back.