In plain English
A coding agent is an AI tool — like Claude Code, Cursor, or GitHub Copilot's agent mode — that doesn't just suggest a line of code. It reads your files, runs commands, edits multiple files, and works through a task on its own while you watch. Prompting it well means writing the instruction that kicks all of that off. Get the prompt right and the agent ships a working change in one shot. Get it vague and you'll spend twenty minutes correcting it — or, worse, it confidently builds the wrong thing.
Here's the analogy that makes it click. Imagine handing a task to a brilliant, lightning-fast contractor who has never seen your house, your neighborhood, or your taste. Tell them "fix the bathroom" and they'll guess: maybe they retile the floor when all you wanted was a new faucet. Tell them "replace the leaking faucet under the bathroom sink with a chrome single-handle model, leave the tiles alone, and test that it doesn't drip" — and they nail it. Same contractor, same skill. The difference is entirely in the brief.
A coding agent is exactly that contractor. It is fast, fluent in fifty languages, and completely ignorant of your project until you tell it. The whole skill of prompting an agent is writing a brief so clear that even a very literal, very eager worker can't get it wrong. This is the practical, codebase-facing cousin of prompt engineering — the same instincts, applied to real files and real commands.
Why it matters
When an agent writes wrong code, the model usually isn't the problem — the brief is. The agent can't read your mind. It doesn't know your team uses ES modules, that you avoid mocks in tests, that there's an existing validateEmail helper it should reuse, or that "done" means the test suite passes, not the file looks plausible. Every one of those gaps gets filled with a guess. A well-written prompt closes the gaps before the agent starts guessing.
The cost of a vague prompt isn't just one wrong answer — it's a correction spiral. You ask, it misses, you correct, it half-fixes and breaks something else, you correct again. Now the conversation is cluttered with failed approaches, and the agent is more confused than when it started. A precise opening prompt skips that entire loop. The fastest path to good code is a good first instruction, not a fast typist.
Who should care:
- Anyone using Claude Code, Cursor, or Copilot for real work. The gap between "this tool wastes my time" and "this tool saved me an afternoon" is almost always prompt quality, not the tool.
- Beginners learning to build with AI. A clear prompt is also a clear thought. Writing one forces you to actually understand what you want — which is half the battle.
- Teams shipping agent-written code. Prompt conventions, a good CLAUDE.md, and a habit of stating acceptance criteria are what keep agent output reviewable and trustworthy at scale.
What did this replace? For a lot of small-to-medium tasks, a sharp prompt replaced the slow loop of writing every line yourself. The new bottleneck isn't typing speed — it's specification speed: how quickly and clearly you can describe the change you want. If you've read what an AI coding assistant is, this is the skill that turns that assistant from a toy into a teammate.
How it works
An agent doesn't answer in one shot like a chatbot. It runs a loop: it reads context, decides on an action (read a file, edit code, run a command), observes the result, and repeats until it thinks the task is done. Your prompt seeds that loop — it sets the goal and the constraints the agent carries through every cycle. A weak prompt means the agent is improvising its goal as it goes.
The step that matters most in that loop is verification. The single biggest upgrade you can make to any agent prompt is giving the agent a way to check its own work: a test to run, a build to pass, a screenshot to compare. Without a check, "looks done" is the only signal the agent has, and you become the verification loop, catching every mistake by hand. With a check, the loop closes itself: the agent edits, runs the test, reads the failure, and fixes it before it ever hands the work back to you.
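The loop described above, including the self-check, can be sketched as a toy model in TypeScript. This is an illustration of the idea, not how any particular tool is implemented: `runAgentLoop`, `AgentAction`, the scripted policy, and the file path are all hypothetical names chosen for the example.

```typescript
// A toy model of an agent loop: decide on an action, run it, observe, repeat.
// Every name here is hypothetical; real tools implement this differently.

type AgentAction =
  | { kind: "read"; path: string }
  | { kind: "edit"; path: string; patch: string }
  | { kind: "run"; command: string }
  | { kind: "finish"; summary: string };

interface AgentObservation {
  action: AgentAction;
  output: string;
}

// The policy stands in for the model: given the goal and what it has seen
// so far, it picks the next action.
type Policy = (goal: string, seen: AgentObservation[]) => AgentAction;

// The executor stands in for the tool layer (file system, shell).
type Executor = (action: AgentAction) => string;

function runAgentLoop(
  goal: string,
  decide: Policy,
  act: Executor,
  maxSteps = 20,
): AgentObservation[] {
  const seen: AgentObservation[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const action = decide(goal, seen);
    if (action.kind === "finish") break; // the agent believes it is done
    seen.push({ action, output: act(action) });
  }
  return seen;
}

// A scripted policy so the sketch runs without a model: read, edit, test, stop.
const scripted: AgentAction[] = [
  { kind: "read", path: "src/auth/session.ts" },
  { kind: "edit", path: "src/auth/session.ts", patch: "(patch text)" },
  { kind: "run", command: "npm test" },
  { kind: "finish", summary: "tests pass" },
];

const trace = runAgentLoop(
  "fix the token refresh bug",
  (_goal, seen) =>
    scripted[seen.length] ?? { kind: "finish", summary: "script exhausted" },
  (action) => `ok: ${action.kind}`,
);
```

In this model the goal string is passed into every call to `decide`, which is the sense in which a prompt "seeds" the loop: a vague goal is vague at every step, not just the first.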
The four parts of a strong prompt
Almost every good coding-agent prompt has four ingredients. Miss one and the agent fills it with a guess.
| Part | What it answers | Example |
|---|---|---|
| Context | Where to look, what to reuse | "in src/auth/, following the pattern in session.ts" |
| Task | The concrete change | "add a refreshToken function that swaps an expired token" |
| Constraints | The guardrails | "no new dependencies, keep the existing error format" |
| Done | How success is checked | "write a test for the expiry case and make the suite pass" |
Watch one prompt grow through all four. Start with the bad version: "fix the login bug." The agent has no idea which bug, where, or how it'll know it's fixed. Now the strong version: "Users report login fails after the session times out. The bug is likely in the token refresh logic in src/auth/. Reproduce it with a failing test first, then fix the root cause; don't just suppress the error. Run the test suite and make sure it passes." Context, task, constraint, and definition of done, all in four sentences. That prompt can run unattended; the first one can't.
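One way to make the four parts habitual is to treat a prompt as structured data and check it for gaps before sending it. The sketch below assumes a hypothetical `Brief` shape and two helper functions, `missingParts` and `renderBrief`; none of these belong to any tool, they only illustrate the checklist.

```typescript
// Hypothetical helpers: assemble a brief from the four parts and report
// which parts are still empty before the prompt goes to an agent.

interface Brief {
  context: string[];
  task: string[];
  constraints: string[];
  done: string[];
}

const SECTION_TITLES: [keyof Brief, string][] = [
  ["context", "Context"],
  ["task", "Task"],
  ["constraints", "Constraints"],
  ["done", "Done when"],
];

// Returns the titles of any empty sections.
function missingParts(brief: Brief): string[] {
  return SECTION_TITLES
    .filter(([key]) => brief[key].length === 0)
    .map(([, title]) => title);
}

// Renders the non-empty sections as plain text, one bullet list per section.
function renderBrief(brief: Brief): string {
  return SECTION_TITLES
    .filter(([key]) => brief[key].length > 0)
    .map(([key, title]) =>
      [`${title}:`, ...brief[key].map((item) => `- ${item}`)].join("\n"),
    )
    .join("\n\n");
}

const vagueBrief: Brief = {
  context: [],
  task: ["fix the login bug"],
  constraints: [],
  done: [],
};

const strongBrief: Brief = {
  context: ["Login fails after the session times out; look in src/auth/."],
  task: ["Reproduce the bug with a failing test, then fix the root cause."],
  constraints: ["Do not just suppress the error."],
  done: ["The test suite passes."],
};

const vagueGaps = missingParts(vagueBrief);   // Context, Constraints, Done when
const strongGaps = missingParts(strongBrief); // no gaps
const strongText = renderBrief(strongBrief);
```

The vague brief fails three of the four checks, which matches the point above: each empty section is a place where the agent would otherwise guess.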
A prompt you can run
Let's make this concrete with a real before/after on a task you might actually give an agent: adding input validation to an API handler. Here's the weak prompt, the kind that triggers a correction spiral:
add validation to the signup endpoint
The agent has to invent everything: which file, which fields, what counts as valid, what to return on failure, and whether to test it. Now the engineered version of the same task, fully briefed:
Add input validation to the signup handler in src/routes/signup.ts.
Context:
- Look at src/routes/login.ts first — follow the same validation
pattern it uses (the validateBody helper).
- Reuse the existing isValidEmail helper in src/utils/validators.ts.
Do not write a new email regex.
Task:
- Reject the request with HTTP 400 if email is missing or invalid,
or if password is shorter than 8 characters.
- On 400, return { "error": "<reason>" } matching the existing
error shape in login.ts.
Constraints:
- No new npm dependencies.
- Keep the happy-path response unchanged.
Done when:
- You've added tests in src/routes/signup.test.ts covering missing
email, invalid email, and short password.
- `npm test` passes. Run it and show me the output.
Read what each block buys you. "Look at login.ts first" points the agent at a working pattern so it doesn't reinvent your conventions. "Reuse isValidEmail" stops it from writing a buggy duplicate. The explicit status code and error shape make the output predictable. "No new dependencies" blocks the classic agent reflex of npm install-ing a validation library you didn't want. And "npm test passes, run it and show me" turns the agent into a self-checking worker instead of a hopeful one.
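To see how tightly that prompt pins down the result, here is a rough sketch of the validation logic it describes. Everything in it is an assumption made for illustration: the real `isValidEmail` and `validateBody` helpers live in the project and are not shown in this article, so `isValidEmail` below is a deliberately simple stand-in, and the error messages are invented placeholders for `<reason>`.

```typescript
// Hypothetical sketch of the validation the prompt asks for.
// isValidEmail is a stand-in so the sketch is self-contained; the prompt
// itself says to reuse the project's existing helper instead.

interface SignupBody {
  email?: unknown;
  password?: unknown;
}

type ValidationResult =
  | { ok: true }
  | { ok: false; httpStatus: 400; body: { error: string } };

// Stand-in only: checks for an "@" with text on both sides and no spaces.
function isValidEmail(value: string): boolean {
  const at = value.indexOf("@");
  return at > 0 && at < value.length - 1 && value.indexOf(" ") === -1;
}

function reject(reason: string): ValidationResult {
  return { ok: false, httpStatus: 400, body: { error: reason } };
}

function validateSignup(input: SignupBody): ValidationResult {
  const email = input.email;
  const password = input.password;
  if (typeof email !== "string" || email.length === 0) {
    return reject("email is required");
  }
  if (!isValidEmail(email)) {
    return reject("email is invalid");
  }
  if (typeof password !== "string" || password.length < 8) {
    return reject("password must be at least 8 characters");
  }
  return { ok: true };
}

// The three cases the "Done when" section asks the tests to cover,
// plus the happy path that must stay unchanged.
const missingEmail = validateSignup({ password: "longenough" });
const invalidEmail = validateSignup({ email: "not-an-email", password: "longenough" });
const shortPassword = validateSignup({ email: "a@example.com", password: "short" });
const happyPath = validateSignup({ email: "a@example.com", password: "longenough" });
```

Each branch maps to a line of the prompt, which is the practical test of a good brief: a reviewer can check the diff against it clause by clause.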
In a real session with a tool like Claude Code, you'd paste that straight in. You can also reference files directly so the agent reads them before it acts:
# Point the agent at the files instead of describing them.
# Claude Code reads @-referenced files before responding.
claude "follow the validation pattern in @src/routes/login.ts and
add the same validation to @src/routes/signup.ts. write tests and
run npm test until it passes."
# Run it non-interactively in a script or CI:
claude -p "fix all eslint errors in src/, then run npm run lint to confirm"
Common mistakes that produce wrong code
Most "the AI wrote bad code" complaints trace back to a handful of prompting mistakes. Recognize these and you'll dodge the majority of bad outputs.
- No definition of done. If you don't say how success is checked, the agent stops when the code merely looks finished. Always give it a test, a build, or a command to run.
- The kitchen-sink prompt. Bundling three unrelated tasks into one instruction confuses the loop and pollutes context. One task per prompt; reset between unrelated tasks.
- Describing instead of pointing. "Make it like our other widgets" is weaker than "follow the pattern in HotDogWidget.tsx." Point the agent at a concrete example file.
- Treating the agent like a search engine. It can't read your mind about constraints. If you don't want new dependencies or a rewritten module, say so; silence is permission.
- Correcting forever instead of restarting. After two failed corrections, the context is cluttered with dead ends. Clear it and write one better prompt that folds in what you just learned.
- Skipping the review. A plausible diff isn't a correct diff. Have a fresh agent review the change against your original ask, or read through the git diff yourself, before you trust it.
The deepest of these is the trust-then-verify gap: the agent produces something that looks right and silently doesn't handle an edge case. The fix is structural, not a phrasing trick — bake verification into every prompt. If you can't give the agent a way to check itself, you have to check it yourself, every single time. The agents that run unattended are the ones whose prompts ended in "and make the tests pass."
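The structural fix can be pictured as a gate: an edit is accepted only once an objective check passes, and each failure is fed back into the next attempt. The sketch below is a minimal model of that idea under stated assumptions. `editUntilVerified`, `attemptFix`, and `runCheck` are hypothetical names, and the check is simulated so the example has no dependencies; in practice `runCheck` might shell out to a command such as `npm test`.

```typescript
// Hypothetical sketch: work is only handed back once a check passes.
// attemptFix and runCheck are injected so the sketch stays dependency-free.

interface CheckResult {
  passed: boolean;
  output: string;
}

interface VerifiedOutcome {
  verified: boolean;
  attempts: number;
  lastOutput: string;
}

function editUntilVerified(
  attemptFix: (previousFailure: string | null) => void,
  runCheck: () => CheckResult,
  maxAttempts = 3,
): VerifiedOutcome {
  let lastOutput = "";
  let previousFailure: string | null = null;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    attemptFix(previousFailure); // edit, informed by the last failure
    const result = runCheck();   // the objective signal
    lastOutput = result.output;
    if (result.passed) {
      return { verified: true, attempts: attempt, lastOutput };
    }
    previousFailure = result.output; // feed the failure back into the loop
  }
  // Out of attempts: report honestly instead of claiming success.
  return { verified: false, attempts: maxAttempts, lastOutput };
}

// Simulated run: the check fails once, then passes after the second fix.
let fixesApplied = 0;
const outcome = editUntilVerified(
  () => { fixesApplied += 1; },
  () =>
    fixesApplied >= 2
      ? { passed: true, output: "all tests passed" }
      : { passed: false, output: "1 test failed: expiry case" },
);
```

The `maxAttempts` cap mirrors the advice above about not correcting forever: when the gate keeps failing, the useful output is the last failure message, which belongs in a fresh, sharper prompt.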
Going deeper
Once single prompts feel natural, the real leverage moves up a level: managing the agent's context window. An agent's working memory holds your conversation, every file it read, and every command's output — and it fills up fast. As it fills, the model starts forgetting earlier instructions and making more mistakes. This is why the best practitioners reset context between unrelated tasks and keep each prompt scoped to one job. Your prompt is competing for space with everything the agent has already read.
That's where a standing instructions file earns its keep. Claude Code reads a CLAUDE.md at the start of every session; Cursor uses rules files. Put your durable conventions there — build commands, code style, "always typecheck when done," the libraries you prefer — so you don't repeat them in every prompt. The discipline is the same as a tight prompt: include only what changes the agent's behavior, and prune ruthlessly. A bloated rules file gets ignored, because the important lines drown in the noise. This is the codebase-scoped version of context engineering: deciding everything the model sees, not just how you phrase the ask.
The frontier of agent prompting is the explore-plan-implement split. Instead of one prompt that does everything, you separate the phases: first ask the agent to read the relevant code and write a plan (without editing anything), review and edit that plan yourself, then tell it to implement. Separating research from execution is the single most reliable way to stop an agent from confidently solving the wrong problem — it surfaces a misunderstanding while it's still cheap, before any code is written.
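As a rough illustration of the split, the three phases can be written as three separate prompts, with the implement prompt refusing to exist until a reviewed plan is supplied. The function `phasePrompt` and the exact wording of each prompt are hypothetical; the point is only that the first two phases forbid edits and the third is gated on human review.

```typescript
// Hypothetical sketch of the explore / plan / implement split as three
// separate prompts. The wording of each prompt is illustrative only.

type Phase = "explore" | "plan" | "implement";

function phasePrompt(phase: Phase, feature: string, approvedPlan?: string): string {
  switch (phase) {
    case "explore":
      return `Read the code relevant to: ${feature}. Do not edit any files. ` +
        `Summarize how it works today.`;
    case "plan":
      return `Write a step-by-step plan for: ${feature}. Do not edit any files. ` +
        `List the files you would change and the tests you would add.`;
    case "implement":
      // Refuse to build an implement prompt until a human has approved a plan.
      if (!approvedPlan) {
        throw new Error("implement phase requires a reviewed plan");
      }
      return `Implement this approved plan exactly:\n${approvedPlan}\n` +
        `Run the tests when you are done and show the output.`;
  }
}

const feature = "token refresh after session timeout";
const explorePrompt = phasePrompt("explore", feature);
const planPrompt = phasePrompt("plan", feature);
const implementPrompt = phasePrompt(
  "implement",
  feature,
  "1. Add a failing test for the expiry case.\n2. Fix the refresh logic.",
);
```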
For production and unattended runs, two more patterns matter. Adversarial review: after the agent finishes, a second agent in a fresh context reviews only the diff against the original requirements — it isn't biased toward code it just wrote, so it catches gaps the first agent rationalized. And fan-out: for big migrations, you write the prompt once, test it on two or three files, then loop the same prompt across hundreds of files non-interactively. At that scale the prompt is no longer a one-off message — it's a small program you debug and tune. The honest truth of the field is that prompting an agent is converging with engineering management: the people who get great output are the ones who write great briefs, give clear acceptance criteria, and verify the work.
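The fan-out pattern can be sketched as data preparation: one prompt template, one invocation per file, and a small pilot batch to tune the prompt before the full run. The sketch only builds the argument lists and does not execute anything. The `claude -p` form follows the non-interactive invocation shown earlier in this article; the template text, the `{file}` placeholder, the file names, and the "new logger API" migration are invented for illustration.

```typescript
// Hypothetical sketch of fan-out: one prompt template applied per file,
// with a pilot batch first. Nothing is executed here; a real script would
// pass each argv to a process spawner and collect the results.

interface Invocation {
  file: string;
  argv: string[]; // argument list for a process spawner, not a shell string
}

function buildInvocations(template: string, files: string[]): Invocation[] {
  return files.map((file) => ({
    file,
    argv: ["claude", "-p", template.split("{file}").join(file)],
  }));
}

// Split the work so the prompt can be debugged on a few files first.
function splitPilot<T>(items: T[], pilotSize: number): { pilot: T[]; rest: T[] } {
  return { pilot: items.slice(0, pilotSize), rest: items.slice(pilotSize) };
}

const migrationTemplate =
  "Migrate {file} to the new logger API. Change no other files. " +
  "Run npm test and report the result.";

const targets = ["src/a.ts", "src/b.ts", "src/c.ts", "src/d.ts", "src/e.ts"];
const batches = splitPilot(buildInvocations(migrationTemplate, targets), 2);
```

Keeping the template in one constant is what makes the prompt behave like a small program: when the pilot batch goes wrong, there is exactly one string to fix before rerunning.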
FAQ
Why does my AI coding agent write the wrong code?
Almost always because the prompt left a gap the agent filled with a guess. It can't know which file, which pattern to follow, what counts as "done," or your unstated constraints unless you say so. Add context (which files), constraints (no new dependencies, reuse this helper), and a definition of done (a test or build it must pass), and wrong-code rates drop sharply.
What makes a good prompt for Claude Code or Cursor?
Four ingredients: context (point it at the right files and an example pattern), the concrete task, constraints (the guardrails), and a definition of done it can verify itself — usually a test or build command. "Fix the login bug" fails on all four; "reproduce the timeout bug with a failing test in src/auth/, fix the root cause, and make the suite pass" hits all four.
Should I write one big prompt or go step by step?
For a clear, small task, one well-specified prompt is best. For a big or fuzzy feature, split it: have the agent explore and write a plan first, review the plan yourself, then tell it to implement. Separating planning from coding is the most reliable way to stop it from solving the wrong problem.
How do I stop a coding agent from going off the rails?
Correct it the moment you see it drift — don't let it run. If you've corrected the same issue twice, the context is cluttered with failed attempts; reset and write one sharper prompt that includes what you learned. A clean session with a better prompt almost always beats a long session full of corrections.
Do I still need to review code an agent writes?
Yes. A plausible-looking diff is not a correct diff, especially around edge cases. Give the agent a way to verify itself (tests, a build, a screenshot), have it show you the evidence, and review the final diff against your original ask. For longer runs, have a fresh agent review the diff in a clean context.
What's the difference between prompting a coding agent and a chatbot?
A chatbot answers and waits. A coding agent runs a loop — it reads files, edits code, runs commands, and checks its own work until the task is done. So your prompt has to set a goal the agent can carry through that loop and, ideally, a check it can run to know when to stop, not just a question to answer.