
AI Doesn't Replace the Spec Problem

Ryan Haney

The production line got faster. The inspector didn’t.

AI has made writing code dramatically faster. A developer can describe what they want in a prompt and get a working implementation in seconds. Code that used to take hours takes minutes. Entire features materialize from a conversation.

This is genuinely remarkable. And it has changed nothing about the fundamental problem.

The specification problem, the gap between what software is supposed to do and what it actually does, hasn’t improved. It’s gotten worse. Because the bottleneck was never writing code. The bottleneck was always knowing what the code should do, verifying that it does it, and maintaining that knowledge over time.

AI accelerated the part that wasn’t the bottleneck.

More code, less understanding

Before AI code generation, a developer who wrote a function understood it, at least at the moment they wrote it. They made the design decisions. They chose the algorithm. They handled the edge cases. The knowledge of what and why existed in their head, even if it was never written down.

With AI-generated code, that’s no longer guaranteed. A developer can prompt an LLM, receive a working function, integrate it into the codebase, and ship it, without ever fully understanding what it does under all conditions. The code runs. The tests pass (if there are tests). The feature works in the demo. But does anyone know what happens when the input is malformed? When the network is slow? When the data is larger than expected?

The code doesn’t carry that knowledge. The developer who accepted it may not have that knowledge. And the specification that would have captured that knowledge was never written.

The result is a new category of technical liability: code that works but nobody understands. Not legacy code that’s been around too long. Brand new code that was never understood in the first place.

Vibe coding

There’s a term gaining traction in the industry: vibe coding. It describes the practice of generating code through AI prompts, iterating until it seems to work, and shipping the result, without ever deeply understanding the implementation.

It’s a catchy name for a real phenomenon. And it’s not necessarily irresponsible: for prototypes, scripts, and throwaway projects, it’s efficient. The problem is when vibe-coded implementations end up in production systems that handle money, health data, infrastructure, or anything else where correctness matters.

The vibe coder isn’t lazy. They’re rational. The AI produces code faster than they can write it. The code appears to work. There’s no specification that says otherwise. Why would they slow down to understand every implementation detail when the output is functional?

This is the trap. Without a specification to verify against, “it works” is the only standard. And “it works” is a dangerously low bar.

The verification gap

Verification requires a standard. You can’t verify code against nothing.

In traditional development, the standard was supposed to be the specification. The spec says what should happen; the code makes it happen; testing confirms they match. In practice, most teams skipped the spec and relied on the developer’s understanding as the implicit standard. It was imperfect, but at least the person who wrote the code had some mental model of what it should do.

AI-generated code breaks even that imperfect system. The developer’s mental model may be “I described what I wanted in a prompt and the AI gave me something that seems right.” That’s not a specification. It’s not even an implicit one. It’s a hope.

When a codebase accumulates enough AI-generated code without specifications, you end up with a system that nobody can fully reason about. The AI that generated the code doesn’t remember it. The developer who accepted it may not understand it. The specification that would have made it verifiable doesn’t exist.

This is the verification gap: the distance between the volume of code being produced and the ability to verify what it does. AI is widening that gap at an unprecedented rate.

The trust problem

There’s a difference between code that runs and code that should be trusted.

Code that runs passes its test suite (if it has one) and doesn’t throw errors in normal operation. Code that should be trusted has been verified against a specification that defines correct behavior across all expected conditions, including edge cases, failure modes, security constraints, and performance requirements.
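The distinction can be made concrete. Here's a minimal sketch, using a hypothetical `parse_amount` helper of the kind an LLM might produce (the function and the bug are illustrative, not from any particular model's output):

```python
import math

# A hypothetical AI-generated helper: parse a dollar amount like "$1,234.50".
def parse_amount(s: str) -> float:
    return float(s.replace("$", "").replace(",", ""))

# "Code that runs": the happy-path check is green, so the feature ships.
assert parse_amount("$1,234.50") == 1234.50

# "Code that should be trusted" requires a spec clause like:
#   malformed input must raise ValueError, never produce NaN.
# Python's float() quietly accepts "nan", so this clause fails:
result = parse_amount("$nan")   # no exception raised
assert math.isnan(result)       # NaN flows silently into downstream code
```

The happy-path test never exercises that input, so nothing forces anyone to notice. Only a spec that defines behavior for malformed input turns this from an invisible hazard into a checkable failure.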

AI-generated code can easily clear the first bar. It’s very good at producing code that runs. But it can’t clear the second bar on its own, because the second bar requires a specification, and the AI doesn’t write specifications. It writes implementations. The spec has to come from somewhere else.

If the spec doesn’t exist, the AI has nothing to be held to. And neither does the developer who accepted its output. The code runs, but there’s no way to know if it’s correct, because “correct” was never defined.

AI needs specs more than humans do

The developers who needed specifications least, the experienced engineers who carried the full mental model in their heads, are being supplemented by a tool that needs them most.

An experienced engineer writing code by hand has implicit specifications: years of domain knowledge, awareness of edge cases, understanding of the system’s constraints. They don’t always write these down, but the knowledge exists and informs their implementation.

An LLM has none of that. It generates code based on statistical patterns. When given a clear, structured specification, it can produce remarkably faithful implementations. When given a vague prompt, it produces code that’s plausible but unverified, because it has no way to distinguish between what the code should do and what seems like a reasonable guess.
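What a "clear, structured specification" can look like in practice is an executable contract: a list of clauses that any implementation, human- or AI-written, must pass. A minimal sketch (the `slugify` function and its spec clauses are illustrative):

```python
import re

# An executable spec for a slugify function. Each clause is a behavior
# the implementation is held to, edge cases included.
SPEC = [
    ("Hello World", "hello-world"),  # lowercase; spaces become hyphens
    ("  trim me  ", "trim-me"),      # leading/trailing whitespace dropped
    ("a--b", "a-b"),                 # no repeated hyphens
    ("", ""),                        # empty input is defined, not an error
]

def slugify(text: str) -> str:
    # A candidate implementation, e.g. one accepted from an LLM.
    text = text.strip().lower()
    return re.sub(r"[\s-]+", "-", text)

# Verification is now mechanical: the code either satisfies the spec or not.
for given, expected in SPEC:
    assert slugify(given) == expected, (given, expected)
```

The point isn't the table format; it's that the definition of "correct" exists outside the prompt and outside anyone's head, so the generated code can be checked against it, and regenerated against it, indefinitely.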

Specifications aren’t just a nice-to-have in an AI-driven development world. They’re the mechanism that makes AI-generated code trustworthy. Without them, AI is just a faster way to produce code that nobody can verify.

The specification imperative

AI didn’t create the specification problem. The industry has been shipping software without adequate specifications for decades. But AI has turned a chronic condition into an acute crisis.

When code was written slowly and by hand, the pace of production roughly matched the pace of understanding. You could get away without specs because the people who wrote the code understood it, at least temporarily. The knowledge was fragile and ephemeral, but it existed.

AI has decoupled production from understanding. Code is produced at a rate that far exceeds any individual’s ability to comprehend it. The only way to close that gap is to make the specification explicit, to define what the software should do in a form that’s durable, verifiable, and independent of the person (or machine) that wrote the code.

The choice is between a software industry that uses AI to build faster and one that uses AI to build faster without knowing what it built.