Simon Willison · 2026-07-03 · notable
Simon Willison — let Fable delegate coding tasks to cheaper models
Simon Willison shows how Fable, running as the main Claude Code loop, can spawn subagents on Sonnet for substantive coding and on Haiku for trivial edits, using its own judgement to route work and cut cost without losing quality.

Simon writes down the tiering rule he keeps giving Fable: judge the task, then hand it to Sonnet or Haiku unless it truly needs the top model.
What is it?
'Fable's judgement' is Simon Willison's July 3 post on how to run Claude Code with Fable as the main model and let it decide when to spawn a cheaper subagent. Rather than hard-coding rules like 'always run tests', Simon writes memory-file instructions that ask Fable to weigh each task and choose the right tool.
How does it work?
Claude Code's Agent tool accepts a model override, so Fable can dispatch a subagent with `model: sonnet` for substantive implementation or `model: haiku` for trivial mechanical edits, then read the returned diff back into its own loop. Simon stores the routing policy in a per-project memory file so it applies to every session in that repo.
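To make the shape of such a routing policy concrete, here is a minimal Python sketch. It is an illustration only: in the post the decision is made by Fable's judgement from prose instructions, not by code. The `Task` fields, the `choose_tier` function, and the three-file threshold are all hypothetical; only the tier names `sonnet` and `haiku` come from the description above, and `fable` stands for keeping the work in the main loop.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical description of a coding task to be routed."""
    description: str
    files_touched: int
    needs_design_decision: bool  # e.g. API shape, architecture, subtle debugging
    is_mechanical: bool          # e.g. rename, reformat, bump a version string


def choose_tier(task: Task) -> str:
    """Return the model tier this toy policy would route the task to."""
    if task.needs_design_decision:
        return "fable"   # judgement work stays in the top loop
    # The three-file cutoff is an arbitrary assumption for illustration.
    if task.is_mechanical and task.files_touched <= 3:
        return "haiku"   # trivial mechanical edit
    return "sonnet"      # substantive implementation


if __name__ == "__main__":
    examples = [
        Task("rename a local variable", 1, False, True),
        Task("implement a CSV import endpoint", 5, False, False),
        Task("decide on the plugin API", 2, True, False),
    ]
    for task in examples:
        print(f"{task.description}: {choose_tier(task)}")
```

A written policy leaves room for nuance that fixed thresholds like these cannot capture, which is the argument for letting the model judge rather than hard-coding the rule.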
Why does it matter?
Fable tokens are the expensive ones in a coding session. Most edits are mechanical and don't need a frontier model, but until now users have had to hand-pick the tier for each task. Delegating that decision to the model, guided by a written policy, pushes cost down without the user having to think about it, and keeps judgement work in the top loop where it belongs.
Who is it for?
Claude Code power users, agent developers, and teams watching token spend.
Try it
Add a memory file at ~/.claude/projects/<project>/memory.md with the tiering rule and let Fable route from there.
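As a starting point, the short Python sketch below appends a tiering rule to a memory file at a path you supply. The rule text is a hypothetical example of the kind of instruction described here, not a quote from Simon's own memory file, and the helper name `append_rule` is made up for illustration.

```python
from pathlib import Path

# Hypothetical policy text: an illustration of the kind of rule described,
# not a quote from the original post.
TIERING_RULE = """\
## Delegating coding tasks

Before starting a coding task, judge how much reasoning it needs.
- Trivial mechanical edits: spawn a subagent with `model: haiku`.
- Substantive implementation: spawn a subagent with `model: sonnet`.
- Design decisions or subtle debugging: do the work yourself in the main loop.
Read every diff a subagent returns before accepting it.
"""


def append_rule(memory_file: Path, rule: str = TIERING_RULE) -> bool:
    """Append the rule to memory_file unless it is already present.

    Returns True if the file was changed, False otherwise.
    """
    existing = ""
    if memory_file.exists():
        existing = memory_file.read_text(encoding="utf-8")
    if rule in existing:
        return False  # already installed, leave the file alone
    if existing and not existing.endswith("\n"):
        existing += "\n"  # keep the new section on its own line
    memory_file.parent.mkdir(parents=True, exist_ok=True)
    memory_file.write_text(existing + rule, encoding="utf-8")
    return True
```

Call it with the memory file path for your own project, for example `append_rule(Path.home() / ".claude" / "projects" / "<project>" / "memory.md")` with `<project>` replaced by the real directory name, then edit the wording until it matches how you want work routed.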