What Is an Agent Scratchpad? Working Memory Inside the Loop

Learn what an agent's scratchpad is, how it serves as working memory within a single task, and why it differs from both context and long-term memory.

Beginner · 11 min read · Updated 2026-06-13

In plain English

An AI agent works on a task over many steps: it reads a goal, takes an action, looks at the result, decides what to do next, and repeats. But the model that drives the agent has no memory of its own between steps. Every time it thinks, it sees only the text you hand it. So where does it keep track of the plan so far, what it has already tried, and the half-finished result it is building?

That place is the scratchpad. It is the agent's working memory: a running area of notes, plans, and intermediate results that lives inside a single task and travels with the agent from step to step. Think of it as the agent's notepad or the back of an envelope it scribbles on while solving a problem.

Picture a person doing long division on paper. They don't hold every digit in their head — they write the partial products in the margin, cross out what's done, and carry numbers down. The paper isn't the answer; it's the messy thinking that leads to the answer. An agent's scratchpad is exactly that margin. It might be literal text in the conversation ("Plan: 1) search docs, 2) read file X, 3) write summary"), a structured to-do list, or an actual file on disk the agent reads and rewrites as it goes.

Why it matters

A language model is stateless. Each call is a fresh start — it remembers nothing from the previous call except what you re-send as text. For a one-shot question that's fine. For a task that takes twenty steps, it's a serious problem: without a scratchpad, the agent would forget its own plan halfway through and start over, or repeat work it already finished.

The scratchpad solves three concrete problems builders hit the moment they move from a chatbot to a real agent.

Staying on track over many steps. A plan written down at step 1 is still visible at step 15. The agent can re-read its own to-do list, check off finished items, and see what remains instead of drifting off-task.
Holding intermediate results. A research agent gathers facts from five searches before it writes anything. Those facts need somewhere to accumulate. The scratchpad is the workbench where partial work piles up until it's ready to assemble.
Making reasoning visible and recoverable. Because the notes are written out, the agent (and you) can see why it chose an action. If a step fails, the agent can look back at what it tried and adjust, rather than blindly retrying the same thing.

This is the layer that sits between the two things people usually talk about. Short-term context (the raw context window) is just "whatever text fits in this one call." Long-term memory is "things the agent remembers across totally separate sessions." The scratchpad is the middle: the deliberate working state for the current job — actively curated, not just whatever happens to be in the buffer, and not meant to outlive the task.

How it works

Mechanically, a scratchpad is text that the agent's loop carries forward and the model keeps reading and rewriting. There is no special hardware and no hidden memory — it's plain content that flows back into the prompt on every step. The loop looks like this:

The scratchpad inside the agent loop:

Read goal + scratchpad → Think / decide next action → Take action (tool call) → Observe result → Update scratchpad → ↺ repeat

Each pass around the loop, the model is shown the goal plus the current scratchpad, decides what to do, acts, sees the result, and writes the outcome back into the scratchpad. Next pass, that updated scratchpad is what it reads. This is the same write-it-down-then-act rhythm as the ReAct pattern, where the agent interleaves reasoning notes with actions — the reasoning trace is a scratchpad.
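The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: `run_agent`, `fake_decide`, and `fake_act` are hypothetical names, and the two "fake" functions stand in for a real model call and a real tool call.

```python
from typing import Callable


def run_agent(goal: str,
              decide: Callable[[str, str], str],
              act: Callable[[str], str],
              max_steps: int = 10) -> str:
    """Run the loop until `decide` returns "DONE" or max_steps is reached."""
    scratchpad = ""  # in-task working memory, starts empty
    for step in range(1, max_steps + 1):
        # Read + think: the model sees only the goal and the current scratchpad.
        action = decide(goal, scratchpad)
        if action == "DONE":
            break
        # Act + observe: run the tool and capture its result.
        observation = act(action)
        # Update: write the outcome back so the next pass can read it.
        scratchpad += f"step {step}: {action} -> {observation}\n"
    return scratchpad


# Hypothetical stand-ins for a model and a tool.
def fake_decide(goal: str, scratchpad: str) -> str:
    finished = scratchpad.count("\n")  # one scratchpad line per finished step
    todo = ["list files", "read report"]
    return todo[finished] if finished < len(todo) else "DONE"


def fake_act(action: str) -> str:
    return f"ok({action})"


notes = run_agent("summarize reports", fake_decide, fake_act)
print(notes)
```

Note that `fake_decide` has no state of its own. It works out what to do next purely from what is already written in the scratchpad, which is the point of the pattern.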

Where the scratchpad actually lives

There's no single right home for it. The three common forms trade off simplicity against how much state you need to hold:

Form	What it is	Best when
In-context notes	The agent's own reasoning text, kept in the running conversation	Short tasks; reasoning is the work
Structured to-do list	An explicit list of steps with done / pending status	Multi-step plans you want to track and check off
External file	A real file (e.g. `notes.md`, `plan.md`) the agent reads and rewrites	Long tasks; results too big to keep re-sending every step

The in-context version is the default and the simplest: the notes are just part of the prompt. But it has a hard ceiling — everything you keep there competes for limited context space, and re-sending a growing pile of notes every step gets slow and expensive. That's why longer-running agents push the scratchpad out of the context window into a file, and only pull the relevant slice back in when they need it.

A scratchpad in practice

Here's the shape of an agent's working text after a few steps of a research task. Notice it holds three things at once: the plan, the running findings, and the next action.

Scratchpad mid-task (text):

GOAL: Summarize our Q3 outage reports into one page.

PLAN:
[x] 1. List all outage report files
[ ] 2. Read each report
[ ] 3. Extract root cause + duration from each
[ ] 4. Write the summary

FINDINGS SO FAR:
- report-07.md: DB connection pool exhausted, 42 min
- report-09.md: bad deploy, rolled back, 18 min
- report-11.md: (not read yet)

NEXT: read report-11.md

On the next step the model reads exactly this, sees that report-11 is unread, fetches it, appends the finding, and ticks the box. The scratchpad is doing all the remembering; the model just reads and updates it.
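One way to hold a scratchpad like this in code is a small data structure that renders itself to text on every step. The sketch below is an assumption about how you might model it; the `Scratchpad` class and its field names are hypothetical and not taken from any particular framework.

```python
from dataclasses import dataclass, field


@dataclass
class Scratchpad:
    goal: str
    plan: list[tuple[str, bool]] = field(default_factory=list)  # (step, done)
    findings: list[str] = field(default_factory=list)
    next_action: str = ""

    def complete(self, index: int) -> None:
        """Tick off plan item `index` (1-based)."""
        step, _ = self.plan[index - 1]
        self.plan[index - 1] = (step, True)

    def render(self) -> str:
        """Produce the text that is re-sent to the model on the next step."""
        lines = [f"GOAL: {self.goal}", "", "PLAN:"]
        for i, (step, done) in enumerate(self.plan, start=1):
            mark = "x" if done else " "
            lines.append(f"[{mark}] {i}. {step}")
        lines += ["", "FINDINGS SO FAR:"]
        lines += [f"- {finding}" for finding in self.findings]
        lines += ["", f"NEXT: {self.next_action}"]
        return "\n".join(lines)


pad = Scratchpad(
    goal="Summarize our Q3 outage reports into one page.",
    plan=[("List all outage report files", True), ("Read each report", False)],
    findings=["report-07.md: DB connection pool exhausted, 42 min"],
    next_action="read report-09.md",
)
pad.findings.append("report-09.md: bad deploy, rolled back, 18 min")
pad.complete(2)
text = pad.render()
print(text)
```

Keeping the state in fields and rendering it fresh each step means the agent's code, not the model, controls the layout, so the plan section cannot silently disappear from the prompt.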

Scratchpad vs context window vs long-term memory

These three are constantly mixed up because they overlap physically — a scratchpad usually lives in the context window, and long-term memory gets loaded into the context window when needed. But they play different roles, and keeping them straight is the key to designing an agent that doesn't lose the plot.

Three layers of agent state:

Context window

The raw text the model sees this call
A fixed-size buffer, measured in tokens
Includes everything: prompt, tools, scratchpad
Rebuilt on every call; nothing carries over unless it is re-sent

Scratchpad

Curated working notes for THIS task
Plans, intermediate results, to-dos
Actively written and pruned by the agent
Discarded when the task ends

Long-term memory

Facts kept ACROSS separate tasks
Stored outside the window (DB, files)
Retrieved when relevant, not always loaded
Survives the task, even the session

A quick way to tell them apart: the context window is the whole desk the model can see right now. The scratchpad is the one notepad on that desk the agent is actively writing on for the current job. Long-term memory is the filing cabinet across the room: not on the desk, but reachable when the agent needs to pull a folder. The scratchpad gets thrown away when the task is done; the filing cabinet keeps what's worth keeping. For the full taxonomy, see "What is agent memory".
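To make the three roles concrete, here is a rough sketch of how one call's context might be assembled. Everything here is illustrative: `retrieve` is a naive keyword lookup standing in for a real retrieval step, and `build_context` and the `memory_store` layout are hypothetical.

```python
def retrieve(memory_store: dict[str, str], query: str) -> list[str]:
    """Naive keyword lookup standing in for a real long-term memory search."""
    words = set(query.lower().split())
    return [fact for key, fact in memory_store.items() if key in words]


def build_context(system: str, memory_store: dict[str, str],
                  scratchpad: str, goal: str) -> str:
    """Assemble the text the model sees on this one call (the context window)."""
    parts = [system, f"GOAL: {goal}"]
    # Long-term memory: only the relevant facts are pulled onto the desk.
    recalled = retrieve(memory_store, goal)
    if recalled:
        parts.append("RECALLED:\n" + "\n".join(f"- {m}" for m in recalled))
    # Scratchpad: the curated working notes for this task, always included.
    parts.append("SCRATCHPAD:\n" + scratchpad)
    return "\n\n".join(parts)


memory_store = {
    "outage": "Outage summaries are posted to the #ops channel",
    "billing": "Invoices are due on the 1st",
}
context = build_context(
    system="You are a careful research agent.",
    memory_store=memory_store,
    scratchpad="[ ] 1. List all outage report files",
    goal="summarize outage reports",
)
print(context)
```

The scratchpad is included unconditionally, while long-term memory is filtered by relevance. That asymmetry is the practical difference between the notepad and the filing cabinet.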

The scratchpad and context limits

Here is the tension that makes scratchpads interesting. The longer an agent works, the more its scratchpad grows — more findings, more checked-off steps, more observed tool output. But the context window doesn't grow with it. Sooner or later the running notes plus the original goal plus the latest tool result no longer fit, or fit but cost too much to re-send every step.

This is where the scratchpad collides with context compaction — the practice of shrinking the running state so it stays under the limit. A few common moves:

Summarize finished work. Once steps 1–3 are done, replace their verbose detail with a one-line summary. The agent doesn't need the full transcript of a search it already extracted the answer from.
Offload to a file. Move bulky intermediate results (a long document, raw scraped text) out of the context into a file, and keep only a pointer plus a short note in the scratchpad. Re-read the file only if a later step needs it.
Prune the dead. Drop completed to-dos and tool outputs that no future step will use. A scratchpad is a working surface, not an archive — old notes are clutter, not value.
Keep the plan, compress the trace. The plan and current findings are precious; the blow-by-blow of how you got each finding usually isn't. Protect the former, compress the latter.
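A single compaction pass combining a few of these moves might look like the sketch below. It rests on several simplifying assumptions: character count stands in for a token count, the "summary" of older steps is a placeholder line rather than a model-written summary, and the function names are hypothetical. It also does not guarantee the result fits; a real agent would likely repeat or escalate if it is still over budget.

```python
def render(plan: list[str], findings: list[str], trace: list[str]) -> str:
    return "\n".join(["PLAN:", *plan, "FINDINGS:", *findings, "TRACE:", *trace])


def compact(plan: list[str], findings: list[str], trace: list[str],
            max_chars: int, keep_recent: int = 2) -> str:
    """Render the scratchpad, running one compaction pass if it is too long."""
    text = render(plan, findings, trace)
    if len(text) <= max_chars:
        return text  # under budget, leave everything as it is

    # Prune the dead: drop to-dos that are already checked off.
    open_items = [item for item in plan if not item.startswith("[x]")]

    # Compress the trace: keep the most recent steps, collapse the rest.
    split = max(len(trace) - keep_recent, 0)
    older, recent = trace[:split], trace[split:]
    summary = [f"({len(older)} earlier steps omitted)"] if older else []

    # Findings are kept whole: they are the part worth protecting.
    return render(open_items, findings, summary + recent)


plan = ["[x] 1. List files", "[ ] 2. Read reports"]
findings = ["report-07.md: DB connection pool exhausted, 42 min"]
trace = [f"step {i}: " + "x" * 50 for i in range(1, 6)]

compacted = compact(plan, findings, trace, max_chars=200)
print(compacted)
```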

A well-designed agent treats its scratchpad as something to maintain, not just append to. The skill of keeping the right notes — and aggressively discarding the rest — is a big part of what separates an agent that finishes a long task from one that runs out of room and loses its place. This curation is also closely tied to agent planning: a good plan written to the scratchpad gives the agent a stable spine to hold onto even as older details get compressed away.

Going deeper

Once the basics click, the interesting questions are about how to manage the scratchpad well, because a sloppy one quietly degrades an otherwise capable agent.

The file-as-memory pattern. For long-horizon work, the most robust approach is to make a real file the source of truth — a plan.md or progress.md the agent reads at the start of every step and rewrites at the end. This survives even if the conversation gets compacted or truncated, because the file isn't subject to the context window at all. The context holds only a fresh read of the file plus the current step; the file holds the durable state. It turns a fragile in-context scratchpad into something that can outlast many rounds of compaction.
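Here is a minimal sketch of that read-at-start, rewrite-at-end rhythm, using only the standard library. `run_step` and `tick_first_open` are hypothetical names; `tick_first_open` stands in for whatever the model and tools actually do during a step. Writing to a temporary file and then swapping it in is a design choice of this sketch, meant to avoid leaving a half-written plan if the process dies mid-write.

```python
import tempfile
from pathlib import Path
from typing import Callable


def run_step(plan_file: Path, do_step: Callable[[str], str]) -> str:
    """One loop iteration with the file as the source of truth."""
    state = plan_file.read_text(encoding="utf-8")  # fresh read at the start
    new_state = do_step(state)                     # model + tools work here
    tmp = plan_file.with_suffix(".tmp")
    tmp.write_text(new_state, encoding="utf-8")    # write the new state aside
    tmp.replace(plan_file)                         # then swap it into place
    return new_state


def tick_first_open(state: str) -> str:
    """Stand-in for real work: mark the first unchecked item as done."""
    return state.replace("[ ]", "[x]", 1)


with tempfile.TemporaryDirectory() as workdir:
    plan = Path(workdir) / "plan.md"
    plan.write_text("[ ] 1. List files\n[ ] 2. Read reports\n", encoding="utf-8")
    run_step(plan, tick_first_open)
    run_step(plan, tick_first_open)
    final_state = plan.read_text(encoding="utf-8")

print(final_state)
```

Nothing is passed between the two `run_step` calls except the file path, so the progress survives even if everything in the conversation is discarded between steps.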

Structured vs freeform notes. Freeform reasoning text is flexible but easy for the model to ignore or contradict. A structured scratchpad — an explicit to-do list with statuses, or a small JSON object of fields — is harder to drift away from because each item is a clear, checkable commitment. Many production agents push toward structure for exactly this reason: a checkbox is a stronger anchor than a paragraph.
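One practical benefit of structure is that updates can be validated before they are accepted. The sketch below assumes a hypothetical JSON layout (a `todos` list with `id`, `text`, and `status` fields) and a hypothetical set of allowed statuses; neither comes from a specific framework.

```python
import json

ALLOWED_STATUSES = {"pending", "in_progress", "done"}


def apply_update(scratchpad_json: str, item_id: int, status: str) -> str:
    """Return the updated scratchpad, rejecting unknown items or statuses."""
    if status not in ALLOWED_STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    pad = json.loads(scratchpad_json)
    for item in pad["todos"]:
        if item["id"] == item_id:
            item["status"] = status
            return json.dumps(pad, indent=2)
    raise KeyError(f"no to-do with id {item_id}")


pad_json = json.dumps({
    "todos": [
        {"id": 1, "text": "List all outage report files", "status": "done"},
        {"id": 2, "text": "Read each report", "status": "pending"},
    ]
})
updated = apply_update(pad_json, 2, "in_progress")
print(updated)
```

With freeform notes there is nothing to check an update against. With a structure like this, a malformed or off-plan update fails loudly instead of quietly corrupting the working state.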

Multi-agent scratchpads. When one orchestrator delegates to sub-agents, each sub-agent usually gets its own scratchpad scoped to its sub-task, and reports back only a clean result — not its messy working notes. This keeps the orchestrator's context from filling up with the internal scribbles of every helper. Deciding what crosses that boundary (the result) versus what stays private (the scratchpad) is a core design choice in agentic workflows.
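The boundary can be illustrated with plain functions. In this sketch, `run_subagent` and `orchestrate` are hypothetical names, and the loop inside `run_subagent` stands in for real model and tool calls. The point is only that the scratchpad is local to the sub-agent and just the result is returned.

```python
def run_subagent(subtask: str, items: list[str]) -> str:
    """Work through a sub-task with a private scratchpad; return only the result."""
    scratchpad: list[str] = []  # private working notes, never returned
    for item in items:
        scratchpad.append(f"looked at {item}")
    return f"{subtask}: {len(scratchpad)} item(s) processed"


def orchestrate(subtasks: dict[str, list[str]]) -> list[str]:
    """Delegate each sub-task and collect clean results only."""
    return [run_subagent(name, items) for name, items in subtasks.items()]


results = orchestrate({"logs": ["a.log", "b.log"], "metrics": ["cpu.csv"]})
print(results)
```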

The honest open problem is that there's no perfect rule for what to keep. Keep too little and the agent forgets a constraint and undoes earlier work; keep too much and it runs out of room and slows down. The whole art of long-running agents lives in this balance, which is why scratchpad management, the agent loop, and compaction are really one connected problem rather than three separate ones. Get working memory right and a modest model can finish surprisingly hard tasks; get it wrong and even a powerful model wanders.

FAQ

What is an agent scratchpad in AI?

It's the working memory an AI agent uses while solving a single task — a running area of plans, notes, and intermediate results that the agent reads and rewrites on each step. Because the underlying model is stateless, the scratchpad is how the agent remembers its own plan and progress from one step to the next.

What is the difference between a scratchpad and long-term memory?

A scratchpad holds in-task state and is thrown away when the current task ends; long-term memory holds facts across separate tasks and sessions. A simple test: if the information should be forgotten the moment this job finishes, it's scratchpad; if it should still matter next time the agent runs, it's long-term memory.

Is the scratchpad the same as the context window?

No, but they're related. The context window is the entire block of text the model sees on one call — prompt, tools, and the scratchpad all live inside it. The scratchpad is the specific, curated chunk of working notes the agent actively maintains for the current task. The window is the desk; the scratchpad is the notepad on it.

Where does an agent keep its scratchpad?

Three common places: inline in the conversation as reasoning text, as a structured to-do list with done/pending status, or in an external file the agent reads and rewrites (like plan.md). Short tasks usually use in-context notes; long tasks push the scratchpad into a file so it survives context limits and compaction.

Why does my agent forget its plan halfway through a long task?

Usually because the scratchpad grew past the context window or was compacted away. Fixes include summarizing finished steps, moving bulky results into a file and keeping only a pointer, pruning completed to-dos, and protecting the plan itself from compression so the agent always has a stable spine to re-read.

What is the difference between a scratchpad and a to-do list?

A to-do list is one form of scratchpad — a structured one. The scratchpad is the broader idea of in-task working memory, which can also hold intermediate findings and freeform reasoning, not just a checklist. Many agents use a to-do list as the backbone of their scratchpad because checkable items are easier to stay anchored to than prose.
