Stop Micromanaging the Model
In 2023, step-by-step prompting was good advice. Models were powerful but distractible. They’d lose the thread halfway through a complex task. Chain-of-thought helped. Breaking the problem into pieces helped. Walking the model through your reasoning helped. People learned to write prompts like procedures. Do this, then this, then this.
The models moved. The prompts didn’t.
The Habit That Became a Handicap
Frontier models in 2026 reason. Not metaphorically. They plan multi-step approaches, consider alternatives, backtrack when something isn’t working. The gap between “smart autocomplete” and “capable reasoning engine” closed faster than anyone’s prompting habits could adjust.
The result is a specific kind of failure that I see constantly. Someone writes a prompt like this:
Read the codebase. First look at the directory structure. Then read the README. Then identify the entry point. Then trace the main function. Then list all the API endpoints. Then for each endpoint, describe the request and response format.
Every instruction here is reasonable. The sequence is logical. And the whole thing is a straitjacket. You just told a system that can reason about code to follow your particular investigation strategy. Your strategy. Based on your assumptions about the codebase. Before either of you has looked at it.
A capable model given that same codebase and the instruction “map the API surface: every endpoint, its method, request shape, and response shape” will find a path to that answer. Probably not your path. Probably a better one. It might grep for route definitions. It might read the OpenAPI spec you didn’t know existed. It might find a test suite that exercises every endpoint with example payloads. You told it to start with the README. The README is four years out of date.
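One of those alternative paths can be sketched in code. Here is a hypothetical grep for Flask-style route decorators — the framework, decorator pattern, and file layout are all assumptions for illustration — the kind of shortcut a model is free to take when you specify the destination rather than the route:

```python
import re
from pathlib import Path

# Hypothetical sketch: instead of starting from a stale README, scan the
# source tree directly for Flask-style route decorators. The regex captures
# the path and, when present, the methods list from the same decorator call.
ROUTE_RE = re.compile(
    r"@\w+\.route\(\s*['\"]([^'\"]+)['\"](?:[^)]*methods=\[([^\]]*)\])?"
)

def map_api_surface(root: str) -> list[tuple[str, str]]:
    """Return (path, methods) pairs for every route decorator under root."""
    endpoints = []
    for src in Path(root).rglob("*.py"):
        for m in ROUTE_RE.finditer(src.read_text(errors="ignore")):
            # Flask defaults to GET when no methods are given.
            methods = (m.group(2) or "'GET'").replace("'", "").replace('"', "")
            endpoints.append((m.group(1), methods))
    return sorted(endpoints)
```

The point isn’t that this script is the right path. It’s that a goal-level prompt leaves the model free to pick it, or something better, depending on what the codebase actually contains.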
Describe the Destination
The English Trap makes the case that interacting with an LLM is specification, not conversation. True. But the specification should be declarative, not imperative. Describe what done looks like. Don’t describe the journey.
This is the same distinction that software figured out decades ago. You don’t write SQL by telling the database which rows to scan and in what order. You describe the result set. The query planner figures out the path. It’s better at choosing paths than you are because it knows things you don’t. Index statistics. Data distribution. Hardware constraints. Your job is to describe what you want. Its job is to figure out how.
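The analogy runs concretely using Python’s built-in sqlite3 (table and index names here are invented for illustration). The query states the result set; whether the planner walks the index or scans the table is its call, not yours:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, revenue REAL)")
conn.execute("CREATE INDEX idx_region ON orders (region)")
conn.executemany(
    "INSERT INTO orders (region, revenue) VALUES (?, ?)",
    [("east", 100.0), ("west", 250.0), ("east", 75.0)],
)

# Declarative: describe the result set. Nothing here says which rows to
# scan or in what order.
rows = conn.execute(
    "SELECT region, SUM(revenue) FROM orders WHERE region = 'east' GROUP BY region"
).fetchall()

# The planner's chosen path is visible but was never specified by us.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT region, SUM(revenue) FROM orders WHERE region = 'east' GROUP BY region"
).fetchall()
```

Whether SQLite uses idx_region depends on its statistics, not on anything in the query text. That separation is the whole point.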
LLMs have gotten good enough that the same principle applies. You’re the product owner, not the project manager. Specify the acceptance criteria, not the implementation plan.
What Over-Specification Costs You
When you hand the model a procedure, three things happen.
You cap the solution quality at your own knowledge. If you knew the optimal path, you wouldn’t need the model. By specifying the path, you’re constraining a system that might have found a better one. Every step you dictate is a step the model can’t improve on.
You make failures catastrophic instead of recoverable. A model following your procedure hits a wall at step 4? It’s stuck. It was told to do step 4. Step 4 doesn’t work. Now what? A model given an end goal hits a wall and routes around it. The failure is information, not a dead end.
You’re debugging your procedure instead of evaluating the output. When the result is wrong, was it the model or was it your instructions? With imperative prompts, you’re maintaining two things: the task and the procedure. With declarative prompts, you’re maintaining one thing: the definition of done.
What Good Specification Looks Like
Most people write something like this:
Take this CSV file. First check for duplicate rows. Then remove any rows where the revenue column is empty. Then sort by date descending. Then calculate the month-over-month change for each row. Then format as a markdown table with columns for date, revenue, and change percentage.
Compare:
Turn this CSV into a markdown table someone could drop into a report. Show the revenue trend over time, newest first. Ignore duplicates and rows with no revenue. Add month-over-month change where it can be computed cleanly.
Still clear. Still enough to judge success. The second version says nothing about how to get there. The model might process it in exactly the order the first version prescribed. It might not. It might notice the dates aren’t parsed consistently and fix that, something the first version never mentioned because you didn’t think of it.
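To make “enough to judge success” concrete, here is a minimal sketch of one output that would satisfy those acceptance criteria. The column names date and revenue, and the ISO date format, are assumptions for this sketch; the model is free to arrive at an equivalent result any way it likes:

```python
import csv
import io

def revenue_report(csv_text: str) -> str:
    """One possible 'done': a markdown table, newest first, duplicates and
    empty-revenue rows dropped, month-over-month change where computable.
    Assumes 'date'/'revenue' columns and ISO dates (sort lexicographically)."""
    seen, clean = set(), []
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["date"], row["revenue"])
        # Ignore duplicates and rows with no revenue.
        if key in seen or not row["revenue"].strip():
            continue
        seen.add(key)
        clean.append({"date": row["date"], "revenue": float(row["revenue"])})
    clean.sort(key=lambda r: r["date"], reverse=True)  # newest first

    lines = ["| date | revenue | change |", "|---|---|---|"]
    for i, r in enumerate(clean):
        prev = clean[i + 1]["revenue"] if i + 1 < len(clean) else None
        # Only report change where it can be computed cleanly.
        change = f"{(r['revenue'] - prev) / prev:+.1%}" if prev else "n/a"
        lines.append(f"| {r['date']} | {r['revenue']:.2f} | {change} |")
    return "\n".join(lines)
```

Note what this sketch is for: not a procedure to hand the model, but a way to check its output. If the table the model returns meets these criteria by some other route, it passes.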
The English Trap says be precise. This says be precise about the right things. Specify the outcome. Leave the how to the system that’s going to do the work.
The Trust Problem
I know why people over-specify. It feels unsafe. “What if the model does it wrong?” It might. It also might do it wrong while following your instructions to the letter, and then you’ve lost both ways.
The instinct to control the process comes from working with less capable systems. Early models genuinely needed guardrails because they’d wander off task. Current frontier models don’t wander. They plan. When they go wrong, it’s usually because the goal was ambiguous, not because they lacked a step-by-step guide.
The fix for bad outputs is almost never “add more procedural steps.” It’s “define the success criteria more sharply.” What exactly should the output contain? What should it not contain? What would make you reject it? Those are the constraints that matter. How the model satisfies them is its business.
Room to Move
Give a capable model a tight destination and a loose path and it will surprise you. Not occasionally. Routinely. It’ll find information you didn’t know was available. It’ll use approaches you wouldn’t have thought of. It’ll solve the problem sideways while you were planning to solve it head-on.
The best prompts I write now look like acceptance criteria on a well-written ticket. Here’s what done looks like. Here’s what I’ll reject. Here are the hard constraints. Go.