Technical debt has a reputation problem. In most organisations, when engineers bring it up, stakeholders hear one of two things: engineers want to do things the right way and business pressure is making them cut corners, or engineers are complaining again about problems that don’t affect the customer. Neither framing is productive, and both have some truth in them, which is part of why the conversation is so hard.

I’ve been navigating this across two companies now: Capital One, where we managed genuinely complex legacy systems alongside new development, and my current role, leading a large engineering organisation in the cloud. The thing that’s made the most difference isn’t having a better technical argument. It’s having a better business frame.

Here’s the distinction I try to hold onto: not all technical debt is the same, and treating it like a single category is where the conversation breaks down. I think about debt in roughly three buckets based on cost and urgency.

The first bucket is what I call active drag. This is debt that’s slowing you down right now — every sprint, in ways that are visible and measurable. A data model that made sense three years ago but now requires a patch every time the product team needs to extend it is a good example: the cost is real, it compounds every week you don’t address it, and the business case is actually not that hard to make — you just have to quantify the drag rather than describing it abstractly.
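
As a rough illustration of what quantifying the drag could look like, here is a minimal Python sketch. Every name and number in it (the hours lost per sprint, the loaded hourly cost, the fix effort) is a hypothetical placeholder rather than a figure from any real team; the point is only that a few estimated inputs turn "this slows us down" into an annual cost and a payback period.

```python
from dataclasses import dataclass


@dataclass
class DragEstimate:
    """Rough cost model for one piece of active-drag debt.

    All fields are hypothetical inputs you would estimate yourself.
    """
    hours_lost_per_sprint: float   # engineer-hours spent on workarounds each sprint
    loaded_hourly_cost: float      # fully loaded cost of one engineer-hour
    sprints_per_year: int
    fix_effort_hours: float        # estimated one-off effort to remove the debt

    def annual_drag_cost(self) -> float:
        # Recurring cost of leaving the debt in place for a year.
        return self.hours_lost_per_sprint * self.loaded_hourly_cost * self.sprints_per_year

    def fix_cost(self) -> float:
        # One-off cost of paying the debt down.
        return self.fix_effort_hours * self.loaded_hourly_cost

    def payback_sprints(self) -> float:
        # Number of sprints before the fix has paid for itself.
        return self.fix_effort_hours / self.hours_lost_per_sprint


# Illustrative numbers only.
estimate = DragEstimate(
    hours_lost_per_sprint=30,
    loaded_hourly_cost=100,
    sprints_per_year=26,
    fix_effort_hours=240,
)

print(f"Annual drag cost: {estimate.annual_drag_cost():,.0f}")
print(f"One-off fix cost: {estimate.fix_cost():,.0f}")
print(f"Payback period:   {estimate.payback_sprints():.1f} sprints")
```

The model deliberately ignores compounding and opportunity cost, so it is likely to understate the drag; that makes it a conservative starting point for the conversation rather than a precise forecast.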

The second bucket is fragility risk. This is debt that isn’t costing you much today but has a meaningful probability of causing a serious problem later: the component that’s three major versions behind and hasn’t been audited for vulnerabilities; the service that only one engineer really understands, when that engineer is looking for other roles; the integration that works fine at current scale but hasn’t been load-tested against where you expect to be in eighteen months. The cost here is probabilistic, which makes it harder to prioritise in a planning session, but the conversation gets easier when you describe it in terms of risk management rather than engineering preference.
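
One simple way to put fragility risk into risk-management terms is an expected-loss ranking: estimated probability of the bad event in a year, multiplied by its estimated impact. The sketch below assumes that framing; the risk names echo the examples above, but the probabilities and impact figures are invented purely for illustration and would need to come from your own estimates.

```python
from typing import NamedTuple


class FragilityRisk(NamedTuple):
    name: str
    annual_probability: float  # hypothetical chance the risk materialises in a year
    impact_cost: float         # hypothetical cost if it does

    @property
    def expected_annual_loss(self) -> float:
        # Classic expected-value estimate: probability times impact.
        return self.annual_probability * self.impact_cost


# Illustrative entries only; none of these numbers come from real systems.
risks = [
    FragilityRisk("component three major versions behind", 0.25, 400_000),
    FragilityRisk("service only one engineer understands", 0.5, 120_000),
    FragilityRisk("integration not load-tested for future scale", 0.125, 640_000),
]

# Highest expected loss first, so the planning discussion starts there.
ranked = sorted(risks, key=lambda r: r.expected_annual_loss, reverse=True)

for risk in ranked:
    print(f"{risk.name}: expected annual loss {risk.expected_annual_loss:,.0f}")
```

Expected loss is a blunt instrument, since it hides the difference between a likely nuisance and an unlikely catastrophe, but it gives stakeholders a comparable number to weigh against the cost of mitigation.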

The third bucket is design regret: decisions that were reasonable at the time but that you now know were wrong, and that are limiting your ability to build new things well. This is the hardest to argue for, because the cost is speculative and forward-looking. The system works, nothing is on fire, and you’re asking for investment to redesign something that customers don’t experience directly.

The frame I’ve found most useful here is option value. When you’re carrying significant design regret, you’re not just working around a bad decision — you’re narrowing the set of future decisions you can make cheaply. Paying down this debt now preserves your ability to make a class of future changes without rewriting large parts of the system. That flexibility has real value, even when it’s hard to attach a number to it. The question I’ve found useful to ask in these conversations is: what product decisions are we not making, or making slowly, because of this constraint? When you can name those, the abstract technical argument becomes a concrete roadmap discussion.

The mistake I see engineering leaders make most often in these conversations is leading with the technical problem rather than the business consequence. “Our service mesh is outdated” doesn’t land. “We can’t safely run more than X requests per second on this service without significant risk, and our product roadmap is planning to send us Y” lands differently.

At Capital One, one of the engineering teams in my org was sitting on a codebase with a substantial layer of undocumented logic that made adding new features genuinely slow and risky. The technical description of what was wrong wasn’t hard to give — any engineer who looked at it understood the problem immediately. What took longer to develop was a way to explain what it was costing in terms a product leader and a finance partner could act on. Working through that with my engineering managers, we eventually got to a version of the explanation that was honest and concrete, and the conversation with stakeholders changed.

The other thing worth saying: not all debt needs to be paid off. Some decisions age out gracefully when the systems they’re part of get replaced. Treating every instance of debt as an emergency is a credibility problem — it makes it harder to be heard when something genuinely is urgent. Triage is part of the job.

Technical debt is a leadership conversation, not a technical one. The engineering judgment tells you what the debt is. The leadership judgment tells you which debt matters enough to fight for.
