Jackson Bates

The Tidiest Debt We Ever Carried

A few years ago we rebuilt our quiz and lesson delivery system from scratch.

The existing codebase had accumulated years of debt. The kind where every new feature feels like you're wading through a tar pit. So we bit the bullet, picked a modern framework, and rebuilt it properly. No regrets. The velocity we've had since is night and day. You hear stories about rebuilds gone wrong all the time. This is not that story.

During that rebuild, product and design started talking seriously about assessments. If you're not in EdTech or teaching, the distinction between a quiz and an assessment might sound like splitting hairs. Both involve a student answering questions. But the meaning of the interaction is completely different. A quiz is low-stakes practice — it's formative, forgiving, a checkpoint. An assessment is a record. It has different integrity requirements, different reporting obligations, different weight in the relationship between a student and their teacher. Similar clothes but a different person wearing them.

We understood this. And because we understood it, we started building toward it.

The new system had the right architecture to support assessments. The componentry overlapped significantly with what we were already building. So as we worked, we roughed in some interfaces — stubs that had no immediate function but codified the concept in our type system. We added a new variant to the discriminated union representing what content could be. We wrote tests that handled it faithfully, because the types demanded it, and because we were doing this properly.
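To make that concrete, here's a minimal sketch of what stubbing a variant into a discriminated union looks like. The names (`Quiz`, `Lesson`, `Assessment`, `describe`) are illustrative, not from the actual codebase:

```typescript
// Hypothetical content model; names are illustrative.
type Quiz = { kind: "quiz"; questions: string[] };
type Lesson = { kind: "lesson"; body: string };

// The stub: no behaviour yet, but adding it to the union means
// the type system now forces every consumer to account for it.
type Assessment = { kind: "assessment" };

type Content = Quiz | Lesson | Assessment;

function describe(content: Content): string {
  switch (content.kind) {
    case "quiz":
      return `Quiz with ${content.questions.length} questions`;
    case "lesson":
      return "Lesson";
    case "assessment":
      // Codified in the types, but nothing produces this yet.
      return "Assessment (not yet implemented)";
  }
}
```

The appeal is obvious: the compiler becomes a checklist, and the future feature has a labelled slot waiting for it.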

It was genuinely good work. The docking point we prepared was elegant. Modular, compartmentalised, ready to receive the feature when it arrived.

And then other priorities landed.

Assessments would get their time, just not yet. Fair enough. That's how product development works. So we kept building. New features, new data requirements, new test cases. And every time we touched the domain, we had to account for our phantom. Every switch statement needed a case for it. Every new test had to appease the types for a thing that didn't exist yet. The armour was still bulletproof. It was just getting a little heavy.
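The shape of that overhead, sketched under the same assumed names: with an exhaustiveness check in place, every new consumer of the union must handle the phantom variant, even though no code path can produce it.

```typescript
// Hypothetical union; the phantom variant is real only to the compiler.
type Content =
  | { kind: "quiz"; questions: string[] }
  | { kind: "lesson"; body: string }
  | { kind: "assessment" }; // nothing constructs this yet

// Standard exhaustiveness helper: the code fails to compile
// if any variant is left unhandled.
function assertNever(x: never): never {
  throw new Error(`Unhandled variant: ${JSON.stringify(x)}`);
}

function iconFor(content: Content): string {
  switch (content.kind) {
    case "quiz":
      return "quiz-icon";
    case "lesson":
      return "lesson-icon";
    case "assessment":
      // Dead branch — unreachable at runtime, but the compiler
      // demands it, and every test suite must cover it.
      return "assessment-icon";
    default:
      return assertNever(content);
  }
}
```

Multiply that dead branch across every switch, every test fixture, every reviewer reading the diff, and the weight of the armour starts to add up.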

Then the conversations about assessments started to shift. The shape of the feature as originally imagined didn't quite fit anymore. The domain model would need to change. Our careful preparations — as it turned out — wouldn't work for what assessments actually needed to be.

So we removed it all. Which was, to the credit of the team, quite easy — it was so neatly contained that pulling it out was straightforward. But the real cost wasn't the removal. It was the year of complexity we'd carried in the meantime. The complexity didn't announce itself as debt because it was tidy complexity. The tests written around a thing that didn't exist. The cognitive overhead of every developer who touched that code and had to hold the phantom in their head alongside everything real. PRs with diffs full of the someday-maybe code.

Effort and complexity, in service of a feature that never made it off the ideas notepad.

We say YAGNI — You Ain't Gonna Need It — like it's a simple rule. And in obvious cases, it is. Don't build the plugin architecture for the app that has three users. Don't abstract the thing you've only used once. Easy peasy XP.

But this wasn't an obvious case. We had genuine signal from product and design. We had a clear mental model of how it would work technically. We had the craft to implement it elegantly. None of that felt like speculation — it felt like foresight. We want Senior Engineers to see around corners. We value smart anticipation that makes future work effortless.

That's the trap. Technical clarity is not the same thing as product clarity. We knew exactly how assessments would need to work in the code. That was real knowledge. But whether assessments would actually take that shape and whether the feature would survive contact with users and roadmap realities and changing priorities — that was still wide open. The code was ahead of the decision.

And the confidence that came from understanding the technical problem made it easy to miss that the product problem wasn't solved yet.

I don't think the original decision was wrong, exactly. The signal was genuine, and at the time, building toward it seemed like the prudent thing. But I'd ask a harder question earlier now: do we understand this well enough as a product decision, or just as an engineering one? Because those two things can feel identical from the inside, and they're not.

The code was perfect. It just had no business being there.