AI Doesn't Eliminate the Cost of Complexity. It Just Changes How We Pay For It.
I was talking with one of my senior engineers the other day about some work happening in one of our codebases — work being done by an engineer who is genuinely experienced, but in a different domain. Backend through and through, contributing to something fairly heavily dependent on the React ecosystem. The kind of situation where, through no fault of their own, they're mostly just accepting what the AI is offering up and hoping for the best.
Which, honestly, is fair enough. That's kind of the whole pitch.
But something came up in that conversation that I haven't been able to stop thinking about. The code that was landing didn't quite follow our conventions. Some of the decisions that we'd made deliberately — load-bearing decisions, the kind that encode a bunch of prior thinking about how our system should behave — weren't being replicated. Not because anyone was being careless. Just because the AI didn't know about them, and the engineer didn't either.
One specific example: we have a strong bias toward type safety. In practice, that means we're pretty aggressive about not letting optional fields slip through unchecked.
The reason for this isn't aesthetic. Whenever you introduce an optional field — a property that may or may not be present depending on context — you're also introducing a set of questions every engineer who touches that code has to reckon with:
- Why is this optional?
- When will it be present?
- When won't it be?
- What happens downstream in each case?
That cognitive load is real. We guard against it deliberately. Some of us even have SonarLint configured to surface complexity creeping in as we write — a little nudge that says hey, you're making this harder than it needs to be.
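To make that concrete, here's a minimal TypeScript sketch of the pattern we bias toward. The names (`Profile`, `avatarFor`) are illustrative, not from our actual codebase — the point is that a discriminated union answers the "when is this present?" question in the type itself, where an optional field leaves it for every reader to reconstruct:

```typescript
// An optional field forces every reader to answer the questions above:
// present when? After upload? Only for some account types?
interface LooseProfile {
  name: string;
  avatarUrl?: string;
}

// A discriminated union encodes the answer in the type: the avatar
// exists exactly when the profile is "complete", and nowhere else.
type Profile =
  | { status: "pending"; name: string }
  | { status: "complete"; name: string; avatarUrl: string };

function avatarFor(profile: Profile): string {
  // The compiler forces the pending case to be handled explicitly,
  // instead of an undefined silently threading its way downstream.
  return profile.status === "complete"
    ? profile.avatarUrl
    : "/images/default-avatar.png";
}
```

Same data, same behaviour — but the second version carries its own answers with it.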
The argument for loosening up #
Here's the counterargument, and it's not a stupid one. There's a growing sentiment that if AI is willing to write slightly more complex code — more optional fields, more code paths, more branching logic — that's not necessarily a problem. Because increasingly, it'll be AI reading that code, not humans. And AI doesn't experience cognitive load the same way we do. It can trace ASTs, follow branching paths, orient itself in an unfamiliar codebase, that kind of thing. It can consume frankly baffling amounts of text in seconds. Why optimise for human comprehension when your primary reader isn't human?
I find this argument genuinely interesting. I also think it's wrong.
A brief taxonomy of complexity #
Bear with me for a second, because I think naming things properly actually matters here.
Cyclomatic complexity is the classic McCabe metric — it counts the number of independent paths through code. Every branch, every loop, every conditional adds to the tally. It's structural and measurable, and it correlates reasonably well with defect density.
Cognitive complexity is a newer idea, developed by SonarSource specifically to address the limitation that cyclomatic complexity doesn't capture how hard code is to read. It penalises nesting, breaks in linear flow, the kinds of things that make humans lose their place mid-function. It's explicitly about mental effort.
Both of these metrics have tooling. Both of them have established arguments for why they matter. And both of them are, fundamentally, human-centric. They model the cost of complexity in terms of what's expensive for a person to hold in their head.
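A small sketch of why the two metrics diverge. Both functions below contain the same three decision points, so McCabe gives each a cyclomatic complexity of 4 — but SonarSource's cognitive complexity rules add a penalty for each level of nesting, so the nested version scores roughly double the flat one (the exact scores are my reading of the published rules):

```typescript
// Cyclomatic complexity: 4 (three branches + 1).
// Cognitive complexity: ~6 — each deeper `if` costs more (+1, +2, +3),
// because the reader has to hold every enclosing condition in mind.
function discountNested(price: number, isMember: boolean, coupon: boolean): number {
  if (price > 0) {
    if (isMember) {
      if (coupon) {
        return price * 0.8;
      }
      return price * 0.9;
    }
    return price;
  }
  return 0;
}

// Identical behaviour and branch count, so cyclomatic complexity is still 4.
// Cognitive complexity: ~3 — each guard clause reads linearly (+1 each),
// and nothing needs to be held in mind past its own line.
function discountFlat(price: number, isMember: boolean, coupon: boolean): number {
  if (price <= 0) return 0;
  if (!isMember) return price;
  if (coupon) return price * 0.8;
  return price * 0.9;
}
```

One metric sees two identical functions; the other sees the difference every human reader feels.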
What I think we're dealing with here is a third thing. And as far as I can tell, it doesn't have a name yet.
The best description I've landed on is something like contextual complexity — the amount of implicit, non-local knowledge required to work correctly in a codebase. The conventions, the patterns, the prior decisions that aren't visible in any single file. The domain knowledge that isn't neatly codified in the existing implementation details. The reasoning that lives in commit history, in Confluence pages nobody reads, or just in the heads of the people who were there when the call was made.
This kind of complexity doesn't show up in SonarLint. You can write code that's clean, readable, and scores well on every existing metric, and still be quietly accumulating it. The optional fields example is one instance. But it's broader than that — it's anything where understanding what the code should do requires context that the code itself doesn't carry.
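Here's a hedged, entirely hypothetical illustration of what I mean. Imagine a codebase where a prior decision — made after a production incident, say — was that money is always stored as integer cents to avoid floating-point drift. Neither function below would trip any linter or complexity metric:

```typescript
// Locally clean: short, readable, scores well on every existing metric.
// But it works in floating-point currency units...
function applyDiscount(price: number, percent: number): number {
  return price * (1 - percent / 100);
}

// ...while the (hypothetical) codebase convention is integer cents.
// Nothing in this file tells you that; the reasoning lives elsewhere --
// in a commit message, a wiki page, or someone's head.
function applyDiscountCents(priceCents: number, percent: number): number {
  return Math.round((priceCents * (100 - percent)) / 100);
}
```

The first function isn't wrong in isolation. It's wrong *here* — and that "here" is exactly the contextual complexity neither the AI nor the visiting engineer can see.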
Peter Naur's Programming as Theory Building argues that this kind of knowledge — the theory of the program — lives in the minds of the people who built it, and can't be fully recovered from the artifacts alone. If human programmers can't reliably reconstruct it from the code, there's no reason to believe AI can adequately recreate the theory of the program either.
Complexity hidden in the unknown, unspoken, uncodified is a problem for humans and AI alike.
The cost of contextual complexity #
LLMs don't have infinite context. They have a fixed window, and — this is something we keep observing in practice — quality degrades well before they hit it. In my experience, you can start seeing meaningful degradation at around 50% of the available token window. Maybe less.
So what happens when you let contextual complexity accumulate unchecked? Each optional field is a branching point. Each divergent pattern is something the agent has to load, reconcile, and hold. A codebase with enough of this starts to look less like a tree and more like a graph — deeply interconnected, where understanding any one part requires potentially loading a lot of other parts.
For a human engineer, a graph-like codebase is miserable. For an LLM working within a context window, I'd wager it's worse. A senior engineer may just know the relevant parts from experience. An LLM has to load them. Every time.
So the argument that AI can handle messy code, and therefore we don't need to be rigorous, is a bit like saying we don't need to manage technical debt because future developers can just figure it out.
We know how that ends.
The second-order problem #
There's something else worth naming here.
When an engineer works outside their domain and accepts what the AI produces, they're not doing anything wrong in the moment. Claude (or whatever tool they're using) probably gave them something that works. It solves the immediate problem. The PR will likely pass a review from someone who doesn't know the area well either.
But what's missing is the reasoning behind the existing patterns. Strong typing, minimal optionals — these aren't arbitrary. They encode prior decisions made by people who were thinking carefully about the shape of the system. When an AI generates code without access to that reasoning, it produces something locally coherent but globally inconsistent.
And then the problem compounds. Because now future AI-assisted work has to navigate two patterns instead of one. And then three. The context budget gets spent faster, the outputs get noisier, and slowly the codebase becomes less legible to the very tool everyone is relying on to maintain it.
I don't know exactly where the tipping point is. My hunch is that we're going to find out the hard way in a lot of codebases over the next few years.
So what do we actually do about it? #
Honestly, I'm still working this out. But I think the starting point is recognising that "AI can handle complexity" is not the same as "complexity is now free."
The cost doesn't disappear. It shifts. And it shifts into a resource — context window capacity, output quality at depth — that is, if anything, more constrained than human attention, not less.
We have tooling for cyclomatic complexity. We have tooling for cognitive complexity. We have nothing equivalent for contextual complexity — the implicit, non-local kind — and I suspect that gap is going to matter a lot more as AI becomes a primary actor in our codebases rather than an occasional contributor.
Which means the question isn't whether to have code quality standards. It's whether our existing standards are the right ones for a world where AI is doing a lot of the reading and writing, and whether we need to think about new ones specifically designed with that in mind.
That second framing is the more interesting one, I think. It's not "keep doing what we were doing." It's "what does good code actually mean now?"
I don't have a clean answer yet. But I'm pretty sure "let the AI write whatever and trust it'll sort itself out" isn't it.