In software engineering, technical debt is an otherwise useful metaphor that has metastasized into a cancer, misused more than used. As an experienced software engineer, I thought I’d write up why.
What is technical debt?
Software constantly changes/evolves. That’s why we have version numbers. Engineers are constantly making changes, small changes here and there, sometimes major rewrites or addition of functionality.
Bad decisions (such as shortcuts) made in one release can have long term consequences in later releases by making change harder. It could be bad design making code brittle, such that inconsequential changes can easily cause big bugs. Or, it can be spaghetti code, making it difficult for programmers to read. Until such problems are fixed, there will be an additional cost for every release of the software. It’s a constant management decision, juggling how much effort is spent cleaning up old stuff and how much is spent adding new stuff.
This debate has existed since at least the 1970s, and there are many names and analogies for it. It’s like a shipping channel filling up with silt needing to be dredged on a regular basis. Or, we say that cleanup efforts can pay long term dividends. Or, an ounce of prevention is worth a pound of cure. In my early days of programming, we called it “code deficit” — the things we needed to fix later.
One analogy has been more successful than the rest, comparing it to monetary debt. A shortcut is like a loan: it helps you release a version of the software faster, but incurs long term interest payments until the principal is paid back. They call this technical debt.
Why is it a bad analogy?
The problem is that people misunderstand debt, believing it’s morally bad. This converts the concept from measuring costs-vs-benefits to a moral crusade. Instead of quantifying the costs, people regularly use the term technical debt to refer to things that must be fixed, regardless of the costs.
The term is commonly used as part of some Holy Crusade. For example, the cybersecurity industrial-complex blames bugs in software on moral failures, sloth, villainy, greed, laziness, ignorance, pride. They regularly use technical debt in the moral sense, that the code is evil, divorced from its original intent, identifying changes that have long term dividends. It’s a staple of the “thought leader” style of speech where they crusade for the moral duty of making everything more secure.
Even without the distraction of “debt”, the underlying issues have been misunderstood for decades.
The reality of software-engineering is that “shortcuts” are often a good idea, but people are prejudiced against them. It’s often better to release a new version faster and then fix it later rather than spending more time doing it right the first time. But everyone believes such shortcuts are wrong, shortsighted, and immoral.
This is the core of what makes Agile different than the original Waterfall method. It’s why people struggle with Agile, they are trying to simultaneously achieve the manifesto “release early and often” and their prejudice “do it right the first time”. You can’t do both, you have to compromise.
Ward Cunningham, who coined “technical debt”, is one of the original contributors to the Agile methodology. He didn’t claim shortcuts are bad and should be avoided. His point was simply that every release should contain some refactoring, fixing old decisions that have become costly. It’s not a moral crusade, just financial planning. Debt isn’t bad, it’s just a thing.
Thus, we have the two underlying prejudices that “debt” is bad and things like “shortcuts” are bad. This makes most uses of the analogy invalid according to how it was originally intended.
Everything is technical debt
The way people use the metaphor is to point to the thing they don’t like and call it technical debt. The reality is that all the code is technical debt. As originally intended, the metaphor was about the most costly debt, deciding which to refactor this next release cycle. As used now, the term has been weaponized to pejoratively describe anything the speaker doesn’t like.
Let’s say you leave your house to go on a trip. Your first choice out the door is to turn left or right. If you make the wrong decision, then you’ll have to turn around and backtrack. The longer you go without fixing the problem, the more costs you incur, the further you’ll have to backtrack.
The same is true in software engineering. Every decision you make to reach your goal means moving further away from some other goal.
And goals change. Probably the most fundamental law of software engineering is that no matter how much time you spend getting the “requirements” right, they will always change later on.
This means the best written software in the world will always have “technical debt”.
It’s why shortcuts are so important, especially early in a project’s lifecycle. The less time you spend going the wrong direction, the less costly it is to turn around and go some other direction.
Later in a project’s lifecycle, you get an appreciation of which parts really are important and which aren’t. You then focus your attention on cleaning up the important parts.
Technical debt is a whole project metaphor, treating everything as debt, focusing on what’s most needing to be refactored. It’s not, as most people use it, just about the parts of a project they most hate.
Refactoring is easy
The technical debt metaphor has a companion concept called refactoring. It means reworking, fixing, rewriting, changing the code to fix the problem (or do other necessary work, like writing automated tests). The reason for a distinctive term is that often more than just the code changes, sometimes it’s a larger change in the design or architecture.
Sometimes such changes are small. For example, the variable name “id” isn’t as meaningful as “employee_id”. An engineer might decide to search-and-replace with the longer variable name in order to make the code more readable.
Sometimes it’s something more substantial. For example, I recently added IPv6 support to my port-scanner “masscan”. This required making small changes in almost every file of the project, changing all the variables holding IPv4 addresses to something that could hold either an IPv4 or an IPv6 address. This is annoying because an IPv4 address can be held as a single 32-bit number, but an IPv6 address is 128 bits and has to be represented as an array of bytes.
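To illustrate the shape of that kind of change, here is a minimal sketch in Python. It is not masscan’s actual code (masscan is written in C), and the `Address` type and `parse_address` helper are hypothetical names invented for this example. It only shows why one type has to carry either 4 or 16 bytes:

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    """Hypothetical container for either an IPv4 or an IPv6 address."""
    version: int   # 4 or 6
    packed: bytes  # 4 bytes for IPv4, 16 bytes for IPv6 (network byte order)

    def as_int(self) -> int:
        # An IPv4 address fits in 32 bits; an IPv6 address needs up to 128.
        return int.from_bytes(self.packed, "big")


def parse_address(text: str) -> Address:
    """Parse a textual address with the standard-library ipaddress module."""
    ip = ipaddress.ip_address(text)
    return Address(version=ip.version, packed=ip.packed)


# Documentation-range addresses, used purely as sample input.
v4 = parse_address("192.0.2.1")
v6 = parse_address("2001:db8::1")
print(v4.version, len(v4.packed))
print(v6.version, len(v6.packed))
```

Every place that used to pass around a bare 32-bit integer now has to pass something like `Address` instead, which is why the change touches nearly every file.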
Sometimes such changes are equivalent in scope to a product rewrite. One of my first jobs out of college was transforming network analysis code originally written in an MS-DOS environment (where an int is 16 bits) that wasn’t modularized to work as a library under any environment, such as Sun Solaris or Windows NT. Since the most basic integer type had to be extended from 16 bits to 32 bits, almost every line of code had to be changed and retested. Also, it had dependencies throughout the project, where user-interface code was embedded in the analysis code. I successfully kept all the original code, just refactored the entire thing.
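A small sketch of why the integer width matters, using Python’s standard `ctypes` module to mimic fixed-width C integers. The `packet_count` value is a made-up example, not something from the original project:

```python
import ctypes

packet_count = 40000  # hypothetical value, too large for a signed 16-bit int

# ctypes integer types do no overflow checking, so the 16-bit value wraps.
as_16_bit = ctypes.c_int16(packet_count).value
as_32_bit = ctypes.c_int32(packet_count).value

print(as_16_bit, as_32_bit)
```

Code written under the 16-bit assumption can depend on that wraparound (or on values never exceeding 32767) in subtle ways, which is why each line has to be revisited and retested when the width changes.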
No matter how transformative or inconsequential the change, it all starts with the same process. You first create an automated testing framework. It’s like a safety net under the code. When you make changes, you can then quickly test the code to make sure you haven’t broken anything.
The lack of an automated testing framework is one of the first long term costs that technical debt is used to identify. Changes are just too expensive without it, to the point that you aren’t willing to do the smallest change like renaming “id” to “employee_id”. The more robust the testing framework, the larger the changes that can be made.
In other words, once you’ve got automated builds and testing, even large refactoring changes become easy.
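As a minimal sketch of such a safety net, here is a tiny test suite using Python’s standard `unittest` module. The `format_badge` function is a hypothetical stand-in for code being refactored (imagine its parameter was just renamed from `id` to `employee_id`); the tests pin down its existing behavior so that any change can be checked in seconds:

```python
import unittest


def format_badge(employee_id: int, name: str) -> str:
    """Hypothetical function under refactoring; the parameter was once `id`."""
    return f"{employee_id:05d}:{name.strip().upper()}"


class FormatBadgeTest(unittest.TestCase):
    """Records current behavior so later changes can be verified quickly."""

    def test_pads_id_and_normalizes_name(self):
        self.assertEqual(format_badge(42, " ada "), "00042:ADA")

    def test_long_id_is_not_truncated(self):
        self.assertEqual(format_badge(1234567, "bob"), "1234567:BOB")


# Run the suite in-process; in a real project a test runner or CI job does this.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(FormatBadgeTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

If a refactoring accidentally changes the output, `result.wasSuccessful()` turns false immediately instead of a customer finding out later.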
Rewriting code is bad
Software-engineers don’t like working on other people’s code. They want to write their own code, from scratch. Many debates in software-engineering are in service of this goal.
They often misuse technical debt to justify this. Often they argue that a project has become bankrupt with too much technical debt, and therefore, the only solution is a rewrite. I have a lot of experience with such claims, and they’ve never been true. All code can be refactored.
The key thing you need to know is that old code has technical capital. It contains a lot of things that either weren’t documented or can’t be documented. It works, but nobody is certain why. When people attempt to rewrite it, they miss all the small details, and create something that only works in theory but not in practice.
These programmers look at the garbage code of the old project and believe their new rewrite will be clean. This is false. Programmers have different areas of expertise. While they may see how to cleanly rewrite the garbage of the old project, their new project will contain different garbage in the other areas outside their expertise.
Moreover, most programmers are bad at things like modularization. Modules are important in the design of software projects. When things are modularized correctly, the impact of changes doesn’t spread very far, stopping at the borders of each module. When things are poorly modularized, minor changes contaminate other modules.
A common refactoring cleanup of technical debt is to identify modularization problems and clean them up.
Programmers are incredibly bad at designing modules in new code. They have all sorts of great sounding arguments for doing the wrong thing. A common one is that they are building for the future, that the extra work now in building extra functionality will pay off in the long run. It’s part of the DRY (don’t repeat yourself) philosophy.
This is false. Any functionality you add now that isn’t needed now becomes technical debt for future programmers. It won’t actually solve any future need. If it’s close, then the module still needs to be changed to adapt it to that need. Now you have changes propagating elsewhere in the project, causing other users of that module to change.
A lot of companies fall into this trap. A startup’s v1.0 (with tons of technical debt) is wildly successful, but brittle. They struggle to keep up with the changes and bug fixes demanded by customers. So they decide to rewrite from scratch, creating v2.0 without the technical debt, taking the time to design things right, creating modules that can then enable reuse in the future.
Such projects almost always fail. In big companies, they eventually figure out how to just refactor the old code and abandon the v2.0 effort. Sometimes they simply abandon their original project and buy a competitor. In small companies, they often go bankrupt.
With enough investment, some do eventually ship a v2.0. But then it simply becomes the new technical debt, with programmers struggling to move forward because a lot of refactoring is needed.
The point is that one way the technical debt concept is misused is to justify rewriting code from scratch, with the promise that the new code won’t have the problems of the old. This is usually wrong. Fixing technical debt means refactoring old code, not declaring the old code bankrupt and incurring new debt.
Business and finance
The misuse of technical debt comes from the misunderstanding of debt. Most people think debt is bad, often morally bad.
But in business, debt is good, or at least neutral. Debt is just a form of capital. Businesses need capital to run.
There are two kinds of capital: equity (stock) and debt (loans, bonds). Companies are largely indifferent to the type of capital. When companies need to raise capital, such as when building a new factory, sometimes they’ll issue stock, sometimes they’ll issue bonds. There are small reasons that push them to choose one over the other, such as current market or regulatory conditions, but overall, they don’t care, the two are largely equivalent. Sometimes they’ll replace one with the other, such as issuing bonds to buy back stock, or issuing stock to pay back bonds.
The major issue is that issuing stock is slightly more expensive, but less risky. Bonds are less expensive capital, but more risky. (This is the reverse point of view from investors, where bonds are considered safer and stocks more risky).
There is really no concept of paying down debt to avoid future interest payments. If you issue stock to pay down debt, then you’ve just replaced one liability with another. Instead of having to pay interest on bonds, you now have to pay stock dividends.
The idea of technical debt is that these long term payments should be avoided. That’s not how business leaders think. In so many other parts of the business, they’d rather have long term payments than short term buyouts. For example, they’d rather lease office buildings than buy them. Airlines typically lease almost all their airplanes rather than own them. Instead of issuing bonds/stock to build a factory, maybe they’ll ask somebody else to build it, then lease the factory. Long term payments are a good thing.
Project managers have to make decisions about what to build for the next release. They’ll decide to add features, fix some bugs, and refactor some technical debt. Each pays long term dividends for the project. They are indifferent as to which to focus on; the fact that technical debt is called “debt” has no special meaning.
Think about it from a company’s point of view. They had an extra $10 million in profits this quarter. They could invest that in paying down technical debt in the software code base. Or they could use that money to pay back real debt (bonds). Most every company has some debt, so engineers claiming technical debt now becomes a debate quantifying which is better, paying down real money debt or paying down technical debt.
The two aren’t as fungible as that, of course. The point is that technical debt isn’t a financial term. You think you are speaking business language to business leaders, but you really aren’t.
Conclusion
As originally coined, technical debt is a fine metaphor. Every software release should contain some refactoring efforts addressing the things that have the most long term costs for a project. It’s part of the “agile” ideas of software development of shipping early and fixing later.
But it’s too often used in other ways. Instead, it’s become a toxic concept based on the misconception that debt is bad, that it’s morally wrong, that it’ll bankrupt projects. As used by most people, it simply means “something that needs to be fixed/rewritten”, abandoning the original justification. It’s used to pejoratively label the specific thing they want changed, rather than as a general management concept.
If you hear the term when discussing what next needs to be refactored in the code base, then it’s a good usage of the metaphor. Almost all other uses are bad.
I may have a distorted view. I work primarily in the cybersecurity industry where it’s always bad. But I come from the software development industry where it’s often still bad. I’m not saying we should abandon the metaphor, only point and laugh at those who misuse it.
Bad metaphor in action
Here are some tweets where we see it in action.
This is a typical use in cybersecurity. They see lack of security as a moral weakness, and technical debt as a moral weakness, so they describe existing insecure systems as a technical debt. But it doesn’t match the metaphor. Leaving vulnerabilities unfixed isn’t causing long term “interest payments”. It’s not incurring any costs at all. Sure, you’ll have to spend money to replace them with adequately secure systems, but this insecurity isn’t their problem, it’s your problem, because you don’t like their security.
It’s not techies talking to business. Techies get annoyed when ignorant business types try to use techie buzzwords. Trying to use “technical debt” to talk to a business type is the same thing going the other direction. It’s used by techies for techies when deciding what code to write next.
It has nothing to do with “building your tower on quicksand”. It’s not about building new software, where building on quicksand might be the right solution. It’s about prioritizing what to do next: identifying that the quicksand is actually a problem and that you need to drive pylons into the ground to address it.
Elon Musk is experiencing this. He took engineering shortcuts when launching his “Starship” booster. Because of soil compaction, the rockets blasted a hole under the launch structure. He’s now driving pylons into the ground to stabilize the soil so that this won’t happen in the next launch.
That technical debt “compounds” or “cumulates” is a myth. It’s a steady cost rather than something that is spiraling out of control to bankruptcy.
Companies stick with legacy systems because they were built incrementally over time and cannot be rewritten. Often, they’ve tried, with costs spiraling out of control, with the release date moving forward 2 years for every year of development. Companies stick with legacy, paying that legacy tax, because they really have no viable option.
Everybody runs legacy, end-of-life, no longer supported software and hardware. It’s a fact of life. As mentioned in the paragraph above, they do so for very good reasons. It’s the least costly of alternatives, and yes while there are ongoing costs, “interest payments”, they’d rather pay them.
This confuses technical debt with the common prejudice that “it costs less to fix it early rather than later”.
Instead, the general idea is that it’ll cost roughly the same to fix it, whenever you decide to do so. If you owe $1000, you’ll pay back $1000 whenever you pay the principal. It’s just that in the meantime, you are paying interest. Until you fix the issue in the code, it’s incurring costs.
Technical debt isn’t “failing to address problems”. That’s not the point. The problem doesn’t need to be fixed. It’s just that until it’s fixed (paid down the principal), it’s incurring costs (interest).
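The arithmetic behind that claim can be sketched in a few lines of Python. The `total_cost` function and the numbers fed to it are hypothetical, and the model deliberately assumes what the paragraph above asserts: the fix costs the same whenever it happens, and the ongoing cost per release is constant rather than compounding.

```python
def total_cost(principal: float, interest_per_release: float,
               releases_before_fix: int) -> float:
    """Total effort spent if the fix is deferred for some number of releases.

    Assumptions (simplifications for illustration only):
    - the fix itself costs `principal` no matter when it is done
    - each release shipped before the fix costs a constant extra amount
    """
    return principal + interest_per_release * releases_before_fix


# Hypothetical numbers: a 1000-unit fix that costs 50 extra units per release.
for releases in (0, 4, 12):
    print(releases, total_cost(1000, 50, releases))
```

Under these assumptions, waiting never makes the fix itself more expensive. It only adds the steady per-release cost, which may or may not be worth paying compared with whatever else that effort could buy.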
Grady Booch is a famous software-engineer responsible for championing anti-Agile ideas of the 1990s (which he now defends as being compatible with Agile).
The old anti-Agile thinking is the idea that the solution to legacy problems is Big Projects, doing “system wide upgrades”. The idea of agile refactoring paying back technical debt is lots of small projects. If it takes 3 months to refactor something, then you are understanding technical debt. If it can’t be solved without a complete overhaul of the system, you are doing it wrong.
This tweet looks like much the same nonsense, but when I read the article, it was actually a reasonable use of the term. The article comes to the correct conclusion:
Organizations must understand fully what parts of their business depend on legacy applications and processes, understanding the impact of those outcomes, and make educated decisions to either cut over to new (new app development in cloud-native) or transition (perhaps via presentation layer) to a completely rearchitected platform.
Yes, it’s a bit too focused on the “cloud” as the preferred solution to all problems, but still, it says the correct core thing: find the legacy costly bits that keep incurring large costs and address them with appropriate fixes. It’s not prejudicial about any particular problem.
This isn’t true. Technical debt is simply rolling up your sleeves and dealing with past decisions that are incurring ongoing costs. There’s no prejudice or judgement about where those decisions came from.
There’s a hindsight bias, that these problems wouldn’t exist if those in the past would’ve just thought more about tomorrow. It’s sometimes true, of course, but it’s generally not true. It’s just a prejudice that people have, that bad decisions are due to moral weakness, such as laziness or sloth.
Even when it’s true, it’s not true. The cases where it’s most often true are when people throw together some crap code to get something done fast, planning to fix it in a couple months, and then 20 years later find out it’s grown into some huge project where everyone complains about the crap code at the center.
I’ve experienced this personally. I tossed together a prototype UI using Microsoft MFC sample code. It was crap, I’m a backend systems programmer, not a UI programmer. It was always intended to be rewritten from scratch. They did attempt a rewrite, but included 20 megabytes of bloat using three different XML libraries just for license key management, and when they shipped it, customers rejected it, and they went back to the old code.
I made the correct decision at the time and would do so again, even as those engineers complained bitterly at my s***t code. They were right, it was, but that doesn’t change anything.
This “if left unchecked” cliché is common. It’s not a thing. The correct use of the analogy allows technical debt to go unchecked. It’s not a problem, it’s just that it keeps incurring costs each development cycle until it’s fixed.
This idea that it’s some sort of dirty thing that hurts the company is false. Sure, companies that can’t quickly change their product to fit changing market needs will suffer. But it’s not bad, it’s just neutral.
This tweet gets it right. Just roll up your sleeves and deal with the issues.
Run away from anybody preaching how to avoid technical debt when writing new code.
Sure, there are a lot of good things to learn about the topic, but they all apply more to refactoring existing code than creating new code.
Moreover, it’s not something you can really learn in the abstract. You need practice refactoring existing code before you can truly understand how to write new code without those sorts of problems.
I was thrown into the deep end refactoring technical debt out of college. This has made me skilled at avoiding it for new projects. It’s the only path.
These are the sorts of meme arguments about moral weaknesses like sloth. In our hearts, we agree with what’s being preached here, that we need to fix things now. And sometimes that’s true.
But most of the time, it’s false. It’s more likely that getting it released sooner is the better answer than “doing it right” and shipping later. Competent teams can always fix it later, and if you don’t have a competent team, it’s likely that you aren’t “doing it right” anyway — just doing it slow.
Almost any use of technical debt in the cybersecurity context is laughably wrong. Sure, if you don’t address security concerns early and ship without them, then you risk getting hacked. It’s a real issue, but it doesn’t really match the technical debt metaphor.
People associate technical debt with moralizing, like “don’t put off until tomorrow what you can do today”. So it all seems vaguely related.
But this isn’t what technical debt means. Sure, maybe you get hacked because you didn’t put in the security features you were supposed to. But this is a one time cost, not a steady stream of “interest payments”. If technical debt is expanded to include this, then everything is technical debt.