AI/TLDR

What Are Subagents? Spawning Helper Agents for Subtasks

Understand subagents — short-lived helper agents launched for one subtask with their own clean context — and how they keep the main agent focused.

INTERMEDIATE10 MIN READUPDATED 2026-06-13

In plain English

A subagent is a fresh AI agent that a main agent launches to handle one specific subtask, then throws away when the subtask is done. It starts with a clean, empty context window — it knows nothing about the bigger job except the small instruction it was handed. It works on its piece, reports back a short answer, and disappears.

Subagents — illustration
Subagents — towardsdatascience.com

Think of a busy chef running a kitchen. The head chef holds the whole dinner service in their head — every table, every course, the timing. When one dish needs a sauce reduced, the head chef doesn't stop and do it personally while juggling everything else. They turn to a line cook and say: reduce this sauce until it coats a spoon, then tell me when it's ready. The line cook focuses on that one thing, ignores the rest of the kitchen, finishes, and reports "sauce is done." The head chef never had to carry the details of stirring in their own head.

That line cook is the subagent. The head chef is the parent (or orchestrator) agent. The key idea: the subagent has its own short-term memory, separate from the parent's. It can churn through a hundred messages of searching, reading, and trial-and-error, and the parent only ever sees the clean two-line summary at the end.

Why it matters

Every agent has a hard limit: the context window. All of its instructions, tool results, and reasoning have to fit inside a fixed token budget. On a long task this budget fills up fast — and once it's full, the agent forgets early details, slows down, and costs more on every single step because it re-reads its entire bloated history each turn.

Subagents are the main tool for protecting that budget. The expensive, noisy work happens inside the subagent's context, which is discarded afterward. Only a compact result crosses back into the parent. This solves several problems at once:

  • Context isolation. A research subagent might read 40 web pages to answer one question. If the parent did that itself, those 40 pages would clog its context for the rest of the job. The subagent absorbs all that text and hands back a three-sentence finding.
  • Focus. A fresh agent with one clear instruction and no distractions reasons more reliably than one agent trying to hold ten half-finished threads at once. Narrow scope means fewer mistakes.
  • Parallelism. Independent subtasks can run as separate subagents at the same time. Exploring five candidate solutions in parallel is far faster than checking them one after another.
  • Reusability. A well-defined subagent — "search the codebase for X," "summarize this document" — becomes a clean building block the parent can call again and again, like a function.

Who cares? Anyone building an agent that does real, multi-step work: a coding assistant exploring a large repository, a research agent gathering sources, a customer-support agent that needs to look up several records. The moment a task is too big to fit comfortably in one context, subagents are usually the answer.

How it works

Spawning a subagent looks, from the parent's side, almost exactly like calling a tool. The parent decides it needs help, packages up a focused instruction, fires off a subagent, and pauses until the result comes back. The crucial detail is what is and isn't shared.

The handoff in, the summary out

When the parent spawns a subagent, it writes a task brief: a small prompt describing exactly what to do and what to return. The subagent does not inherit the parent's conversation history — that's the whole point. It starts empty, so the parent must explicitly include any background the subagent actually needs. Anything left out, the subagent simply won't know.

On the way back, the subagent's full transcript — every tool call, every dead end, every page it read — is discarded. Only its final message returns to the parent, where it lands as a single, compact tool result. This asymmetry is the magic: a thousand tokens of messy work in, fifty tokens of clean answer out.

Each child above has its own private context. They don't see each other's work and they don't see the parent's. The parent collects their three summaries and weaves them together. Below is what spawning one looks like in rough pseudocode — note that the result the parent stores is just text, not the subagent's whole history.

spawning a subagent (sketch)python
def run_subagent(task_brief, tools):
    # Fresh context: the subagent starts empty, knows only the brief.
    sub = Agent(
        system="You handle ONE subtask. Return a short, final answer.",
        tools=tools,
    )
    # The subagent runs its own loop: it may search, read, retry
    # many times. All of that lives in ITS context, not the parent's.
    result = sub.run(task_brief)      # full transcript stays inside `sub`
    return result.final_text          # only the summary escapes

# Parent side: this looks just like calling a tool.
finding = run_subagent(
    task_brief="Find our current refund window for physical goods. "
               "Return just the number of days and the source.",
    tools=[web_search, read_page],
)
# `finding` is ~1 line. The 40 pages the subagent read never touch
# the parent's context window.
messages.append({"role": "tool", "content": finding})

What subagents are good for

Subagents shine whenever a subtask is self-contained, context-heavy, or repeatable. A few patterns show up again and again in real systems.

Use caseWhy a subagent fits
Deep researchReading many sources produces a lot of text; the parent only needs the synthesized finding, not the raw pages.
Codebase searchGrepping and reading files fills context with noise. A subagent returns just "the auth logic lives in auth.py, here's how it works."
Parallel explorationTry several approaches at once — five subagents draft five solutions, the parent picks the best.
Specialized rolesA "reviewer" subagent or a "tester" subagent with its own focused instructions and tools, called when needed.
Untrusted, bulk readingSummarizing long documents or emails in isolation keeps their bulk — and any hidden instructions — out of the parent.

The common thread: in every case, the process generates far more text than the answer. The subagent eats the process so the parent can keep the answer. Where this is not worth it: tiny lookups (a single tool call the parent can just make itself) and tasks where the parent constantly needs to see the intermediate work.

Subagent vs multi-agent system

These terms overlap, which trips people up. A subagent is a primitive — one helper agent spawned with an isolated context for one subtask. A multi-agent system is an architecture — a whole design where multiple agents coordinate. Subagents are often the building block, but not every multi-agent setup uses the spawn-and-summarize pattern.

The most common way to use subagents is the orchestrator-worker pattern: one parent plans the work and spawns worker subagents for each piece. That article is about the architecture — how to split and coordinate the work. This one is about the piece it spawns: what a single subagent actually is and how its isolated context behaves.

There's also the flatter case where peer agents talk directly to each other rather than through one boss — increasingly via standardized agent-to-agent protocols. Those peers usually keep their own ongoing context and aren't discarded after one task, so they're better called agents than subagents.

Going deeper

Subagents are powerful but not free. The same isolation that protects context also creates the hard problems. Once the basics click, these are the trade-offs to understand.

No shared context is a double-edged sword. Because the subagent starts blank, it can only act on what's in its brief. If the parent forgets to pass a key constraint — the user's actual goal, a deadline, a format requirement — the subagent will confidently produce something off-target. Most subagent failures are really bad-brief failures. Spend your effort on writing tight, complete briefs.

Coordination has a cost. Every spawn adds latency (the subagent runs a whole loop) and tokens (the brief out, the summary in). For a one-line lookup, that overhead isn't worth it — the parent should just do it. Subagents pay off when the subtask is genuinely heavy. Spawning a subagent for trivial work is a classic over-engineering mistake; see do you even need an agent for the broader version of this question.

The summary is a lossy bottleneck. Whatever the subagent didn't include in its final message is gone for good. If the parent later needs a detail the subagent saw but didn't report, it has to spawn the work again. Design briefs so the summary captures everything the parent could plausibly need next — including, often, a pointer to where the full detail lives (a file path, a URL) rather than the detail itself.

Parallel subagents can't see each other. Three subagents running at once won't coordinate or deduplicate work — they don't share context by design. If two of them need to agree, that agreement has to happen in the parent after they all return. For tasks that need tight back-and-forth between workers, a single agent or a more connected multi-agent design often beats fan-out subagents.

Security carry-over. A subagent that reads untrusted content (web pages, user files, emails) can be hit by hidden instructions in that content. Isolation helps — the malicious text stays out of the parent — but the subagent itself can still be steered, and it may pass a poisoned summary upward. Validate what comes back; don't treat a subagent's summary as automatically trustworthy just because it's short.

The durable lesson: a subagent is a context boundary with a job attached. Use it when isolating heavy work pays for itself, write the brief as if the subagent can read your mind about nothing, and design the return summary as carefully as the input. Get those right and subagents are the cleanest way to scale a single agent into something that can tackle work far larger than one context window could ever hold.

FAQ

What is a subagent in AI?

A subagent is a fresh AI agent that a main agent spawns to handle one specific subtask. It starts with its own clean, isolated context window, works on its piece, returns a compact summary, and is then discarded. The point is to keep the heavy, noisy work out of the parent agent's limited context.

How is a subagent different from a multi-agent system?

A subagent is a single primitive — one helper agent spawned for one subtask with an isolated context. A multi-agent system is the broader architecture of multiple agents coordinating. Subagents are a common building block for multi-agent systems, but not every multi-agent design uses the spawn-and-summarize pattern.

Why do subagents save context?

All of the subagent's intermediate work — searching, reading, retrying — happens inside its own context window and is thrown away afterward. Only its short final summary returns to the parent. So a subagent that read 40 pages might hand back just three sentences, keeping those 40 pages out of the parent's budget.

Does a subagent share memory with its parent agent?

No. A subagent starts with an empty context and does not inherit the parent's conversation history. The parent must explicitly include any background the subagent needs in its task brief. On the way back, only the subagent's final message reaches the parent — the rest of its transcript is discarded.

When should I not use a subagent?

Skip subagents for trivial work — a single tool call the parent can make itself doesn't justify the extra latency and tokens of spawning a whole agent. Also avoid them when the parent needs to watch the intermediate steps, or when several workers must coordinate tightly, since subagents can't see each other's context.

How does a result get back from a subagent to the parent?

The subagent's full transcript is discarded and only its final message returns, landing in the parent as a single compact tool result — much like the output of a normal tool call. Because this summary is lossy, good briefs ask the subagent to include everything the parent might need next, or a pointer to where the full detail lives.

Further reading