Agent Framework Mental Models: Graphs, Loops, and Crews

Learn the three mental models behind every agent framework — graphs, loops, and crews — so you can map any new tool onto one of them instantly.

Intermediate | 14 min read | Updated 2026-06-12

In plain English

Every agent framework is a different way of thinking about agents before it is a library. LangGraph asks you to draw a flowchart. The bare agent loop asks you to write a while loop. CrewAI asks you to cast a TV production crew. These three pictures — the graph, the loop, and the crew — are not just API styles; they are mental models that determine how you structure problems, what you name things, and where bugs tend to hide.

Think of it like navigation. One person thinks about a road trip as a turn-by-turn graph (nodes are intersections, edges are roads). Another person thinks of it as a recursive loop — while destination not reached: evaluate position, take next best action. A third person thinks of it as a team assignment — the driver drives, the navigator reads the map, the co-pilot looks for hazards. All three reach the same destination; they just frame the problem differently. When you learn a new agent framework, the first question is: which of these three pictures is it using?

Why mental models matter more than APIs

The framework's mental model determines how you decompose problems. If you reach for LangGraph and your problem is naturally a linear sequence of role handoffs, you spend days wrestling with graph edges and node state to express something CrewAI would represent in 15 lines. If you reach for a bare agent loop for a workflow that has 12 conditional branches and requires resuming after a crash, you end up hand-rolling exactly what LangGraph gives you for free — but worse.

Mental models also surface different kinds of bugs. In the loop model, the classic bug is an infinite loop: the agent keeps calling tools without converging. In the graph model, the classic bug is an unreachable or dead-end node: a node that no edge ever leads to, or one the agent enters that has no path onward to END. In the crew model, the classic bug is task contamination: a downstream agent receives context from a previous agent that silently biases its output. Knowing the model tells you where to look when things go wrong.
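The graph-model bug described above can be caught mechanically before any LLM is called. The sketch below is a framework-free illustration, not a LangGraph feature: it assumes the workflow is written down as a plain dict of edges, and the names `wiring_problems`, `reachable`, and the `END` sentinel are hypothetical helpers introduced here.

```python
from collections import deque

END = "END"  # hypothetical sentinel standing in for the terminal node

def reachable(edges: dict[str, list[str]], start: str) -> set[str]:
    """Breadth-first search: every node reachable from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def wiring_problems(edges: dict[str, list[str]], entry: str) -> dict[str, set[str]]:
    """Report nodes nothing leads to, and nodes that can never reach END."""
    nodes = set(edges) | {n for targets in edges.values() for n in targets}
    from_entry = reachable(edges, entry)
    unreachable = nodes - from_entry - {END}
    dead_ends = {n for n in from_entry if n != END and END not in reachable(edges, n)}
    return {"unreachable": unreachable, "dead_ends": dead_ends}

edges = {
    "planner": ["retriever"],
    "retriever": ["retriever", "answer", "fallback"],
    "answer": [END],
    "fallback": [],        # entered but never leaves: a dead end
    "summarizer": [END],   # no edge leads here: unreachable
}
problems = wiring_problems(edges, "planner")
```

With this edge map, `problems` flags `summarizer` as unreachable and `fallback` as a dead end. The same check has no equivalent in the loop or crew models, which is part of why their characteristic bugs only show up at runtime.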

Finally, mental models determine how you communicate with your team. Saying "we need to add a conditional edge from the planner to the retriever" means something precise in LangGraph. "We need a new task between researcher and writer" means something precise in CrewAI. Speaking the framework's native vocabulary reduces ambiguity in code reviews and architecture discussions.

The three mental models

Three mental models dominate the agent framework landscape. Understanding each one on its own terms — not just as a comparison — is the fastest path to fluency in any specific framework.

// Three agent mental models

Loop

Core unit: a while-loop iteration
State: a growing message history
Control: LLM decides next action
Vocab: think, act, observe, repeat
Exemplars: ReAct pattern, OpenAI Agents SDK
Bug pattern: infinite loop

Graph

Core unit: node + edge
State: a typed shared state object
Control: explicit conditional edges
Vocab: nodes, edges, StateGraph, checkpoints
Exemplars: LangGraph
Bug pattern: unreachable node

Crew

Core unit: role-based agent + task
State: task outputs passed between agents
Control: sequential or hierarchical delegation
Vocab: agent, role, task, crew, process
Exemplars: CrewAI
Bug pattern: task context contamination

Mental model 1: The loop

The loop model is the closest to how LLM agents actually work under the hood. Every agent is a while loop: send the current message history to the LLM, get a response, check whether that response is a final answer or a tool call, execute the tool if needed, append the result to history, and repeat. The loop terminates when the LLM produces a final text answer or a maximum-step limit is hit.

The formal version of this model is the ReAct pattern (Reasoning + Acting), introduced in a 2022 paper. The LLM alternates between generating a thought (reasoning about what to do next), an action (calling a tool), and an observation (receiving the tool's result). This cycle continues until the LLM is confident enough to emit a final answer.

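Bare agent loop (conceptual pseudocode, Python)

def run_agent(user_input, llm, available_tools, execute_tool, max_steps=10):
    # llm and execute_tool are placeholders for your own client and tool dispatcher
    messages = [{"role": "user", "content": user_input}]

    for _ in range(max_steps):  # maximum-step limit guards against an infinite loop
        response = llm.call(messages, tools=available_tools)

        if response.is_final_answer:
            return response.text

        # LLM decided to call a tool: record the request, then the result
        messages.append({"role": "assistant", "content": response.text, "tool_call": response.tool_call})
        tool_result = execute_tool(response.tool_call)
        messages.append({"role": "tool", "content": tool_result})
        # loop: send updated history back to LLM

    raise RuntimeError(f"No final answer after {max_steps} steps")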

Frameworks built on the loop model — including the OpenAI Agents SDK and raw API-based approaches — wrap this core with robustness features: max-iteration guards, tool error handling, streaming responses, and conversation history management. The mental model stays a loop; the framework just makes the loop production-grade.
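To make the robustness features listed above concrete, here is a small runnable sketch of the loop with a max-iteration guard and tool error handling. It is an illustration under stated assumptions, not any SDK's implementation: `scripted_llm` is a stand-in that fakes the model's decisions, and `Reply`, `TOOLS`, and `run_loop` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Reply:
    """One model turn: either a final answer or a tool request."""
    text: str = ""
    tool_name: Optional[str] = None
    tool_arg: str = ""

def scripted_llm(messages: list[dict]) -> Reply:
    # Stand-in for a real model: asks for the calculator once, then answers.
    observations = [m for m in messages if m["role"] == "tool"]
    if not observations:
        return Reply(tool_name="calculator", tool_arg="6 * 7")
    return Reply(text=f"The answer is {observations[-1]['content']}.")

def calculator(expression: str) -> str:
    left, op, right = expression.split()
    if op != "*":
        raise ValueError(f"unsupported operator: {op}")
    return str(int(left) * int(right))

TOOLS: dict[str, Callable[[str], str]] = {"calculator": calculator}

def run_loop(user_input: str, llm=scripted_llm, max_steps: int = 5) -> str:
    messages = [{"role": "user", "content": user_input}]
    for _ in range(max_steps):                      # max-iteration guard
        reply = llm(messages)
        if reply.tool_name is None:                 # final answer: stop
            return reply.text
        messages.append({"role": "assistant",
                         "content": f"call {reply.tool_name}({reply.tool_arg})"})
        try:
            result = TOOLS[reply.tool_name](reply.tool_arg)
        except Exception as exc:                    # tool error handling
            result = f"tool error: {exc}"
        messages.append({"role": "tool", "content": result})
    return "stopped: step limit reached"

answer = run_loop("What is 6 times 7?")
```

Feeding the tool error back as an observation, rather than raising, gives the model a chance to correct itself; the step limit is what stops it if it never does.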

Mental model 2: The graph

The graph model, exemplified by LangGraph, reimagines agent workflows as directed graphs — the same data structure used in compilers, state machines, and workflow engines. Each node is a Python function that reads from a shared state object, does computation (calls an LLM, invokes a tool, transforms data), and writes back to state. Each edge defines what happens next: a plain edge always goes to the same node; a conditional edge calls a router function that returns the name of the next node at runtime.

The shared state object is the most important concept in the graph model. It is a typed schema, most commonly a Python TypedDict (LangGraph also accepts Pydantic models and dataclasses), that all nodes read from and write to. This is different from the loop model's message history, which is an append-only list of conversational turns. The graph model's state can hold anything: intermediate results, flags, structured data, lists of sub-tasks. Because state is typed and explicit, it is much easier to reason about what has happened and what comes next.

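LangGraph node and conditional edge (illustrative, Python)

from typing import TypedDict
from langgraph.graph import StateGraph, START, END

class AgentState(TypedDict):
    query: str
    tool_results: list[str]
    confidence: float
    final_answer: str

def search(query: str) -> list[str]:
    # placeholder: swap in a real retriever
    return [f"document about {query}"]

def retriever_node(state: AgentState) -> dict:
    # fetch documents and return only the keys that changed
    results = state["tool_results"] + search(state["query"])
    # placeholder scoring: confidence grows with the evidence gathered,
    # so the loop back to "retriever" eventually terminates
    return {"tool_results": results, "confidence": min(1.0, 0.5 * len(results))}

def answer_node(state: AgentState) -> dict:
    # placeholder: a real node would call an LLM here
    return {"final_answer": " ".join(state["tool_results"])}

def router(state: AgentState) -> str:
    """Conditional edge: which node comes next?"""
    if state["confidence"] >= 0.8:
        return "answer"
    return "retriever"   # loop back for more evidence

graph = StateGraph(AgentState)
graph.add_node("retriever", retriever_node)
graph.add_node("answer", answer_node)
graph.add_edge(START, "retriever")   # entry point
graph.add_conditional_edges("retriever", router, {"retriever": "retriever", "answer": "answer"})
graph.add_edge("answer", END)
app = graph.compile()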

Because the entire workflow is an explicit graph, LangGraph can do things the loop model cannot easily do: checkpoint the full state at any node so that a crashed agent can resume from exactly that point, replay from any past state for debugging, and pause execution at a designated node to ask a human for approval before continuing.
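The checkpoint-and-resume idea can be pictured with a few lines of plain Python. This is a deliberately simplified sketch and not how LangGraph's checkpointers are implemented: it assumes an in-memory list as the checkpoint store, and `run_graph`, `router`, and the two node functions are hypothetical names. The second node fails on its first attempt to simulate a crash.

```python
import copy

END = "END"

def run_graph(nodes, router, state, start, checkpoints):
    """Run node functions until END, saving (next_node, state) after each step."""
    current = start
    while current != END:
        update = nodes[current](state)        # node returns only changed keys
        state = {**state, **update}           # merge update into shared state
        current = router(current, state)      # edge logic picks the next node
        checkpoints.append((current, copy.deepcopy(state)))
    return state

calls = {"enrich": 0}

def fetch(state):
    return {"docs": ["doc-1", "doc-2"]}

def enrich(state):
    calls["enrich"] += 1
    if calls["enrich"] == 1:
        raise RuntimeError("simulated crash")   # first attempt fails
    return {"summary": f"{len(state['docs'])} docs summarized"}

def router(current, state):
    return {"fetch": "enrich", "enrich": END}[current]

nodes = {"fetch": fetch, "enrich": enrich}
checkpoints = []

try:
    run_graph(nodes, router, {"query": "agents"}, "fetch", checkpoints)
except RuntimeError:
    # Resume from the last checkpoint instead of starting over.
    resume_node, saved_state = checkpoints[-1]
    final_state = run_graph(nodes, router, saved_state, resume_node, checkpoints)
```

On resume, `fetch` is not re-run: the saved state already holds its output, and execution picks up at `enrich`. A production checkpointer would persist to durable storage rather than a list, but the shape of the idea is the same.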

Mental model 3: The crew

The crew model, embodied by CrewAI, thinks about agents the way a TV production company thinks about staff: you hire specialists, assign them roles, and give them tasks to execute in sequence. There is no explicit state graph and no hand-rolled while loop. You declare agents (a Researcher, a Writer, a Reviewer), give each one a role, goal, backstory, and tool set, then assemble them into a Crew with an ordered list of Task objects.

In the default sequential process, tasks execute in the order you define them. Each task's output becomes available as context to the next task automatically. In the optional hierarchical process, you designate a manager agent that uses its own LLM to decide which worker agent should handle each task, enabling dynamic delegation. You do not write routing logic; the manager reasons about it.
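The sequential process described above reduces to a short loop in which each output becomes the next step's context. The sketch below is a framework-free approximation, not CrewAI's code; `Worker`, `Job`, and `run_sequential` are hypothetical names chosen to avoid confusion with CrewAI's own `Agent` and `Task` classes, and the lambdas stand in for LLM calls.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Worker:
    role: str
    work: Callable[[str, str], str]   # (task description, context) -> output

@dataclass
class Job:
    description: str
    worker: Worker

def run_sequential(jobs: list[Job]) -> list[str]:
    """Run jobs in order; each job sees the previous job's output as context."""
    outputs: list[str] = []
    context = ""
    for job in jobs:
        output = job.worker.work(job.description, context)
        outputs.append(output)
        context = output              # handed forward automatically
    return outputs

# Stand-ins for LLM-backed agents.
researcher = Worker("Researcher", lambda task, ctx: "finding A; finding B")
writer = Worker("Writer", lambda task, ctx: f"Article based on: {ctx}")

outputs = run_sequential([
    Job("Research agent frameworks", researcher),
    Job("Write a short article", writer),
])
```

Note that the writer receives whatever the researcher produced, with no filtering in between. That automatic hand-forward is what makes the model quick to use, and it is also the route by which task contamination travels.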

CrewAI crew definition (illustrative, Python)

from crewai import Agent, Task, Crew, Process

researcher = Agent(
    role="Senior Research Analyst",
    goal="Find accurate, up-to-date information on the given topic",
    backstory="You have 10 years of research experience and a nose for primary sources.",
    tools=[web_search_tool],  # placeholder: any CrewAI-compatible tool instance you have configured
)

writer = Agent(
    role="Content Writer",
    goal="Turn research into a clear, readable summary",
    backstory="You write for a technical audience that values precision over fluff.",
)

research_task = Task(
    description="Research the latest developments in {topic}",
    expected_output="A bullet-point summary of 5-8 key findings with sources",
    agent=researcher,
)

write_task = Task(
    description="Write a 300-word article based on the research",
    expected_output="A polished article with an intro, body, and conclusion",
    agent=writer,
)

crew = Crew(
    agents=[researcher, writer],
    tasks=[research_task, write_task],
    process=Process.sequential,
)

result = crew.kickoff(inputs={"topic": "LLM agent frameworks"})

The crew model's strength is how quickly it maps to real-world team structures. If you can describe your workflow as "a researcher gathers facts, a strategist decides what to do with them, a builder executes the plan, and a reviewer checks the output," CrewAI almost writes itself. The crew model's weakness is the same: it assumes your workflow is a team of sequential specialists. When you need fine-grained branching, parallel execution, or crash-safe checkpointing, you end up fighting the abstraction.

Tradeoffs at a glance

Every mental model makes specific tradeoffs between expressiveness, learning curve, debuggability, and speed to first working prototype. The table below captures the key dimensions.

Dimension	Loop	Graph (LangGraph)	Crew (CrewAI)
Learning curve	Lowest — just a while-loop	Highest — must learn graph concepts, state typing, and edge routing	Low — maps to intuitive team metaphors
Expressiveness	High — any logic fits in a loop	Very high — arbitrary branching, parallel nodes, cycles	Medium — sequential or hierarchical team flows
State management	Append-only message history	Typed shared dict, full checkpoint support	Task outputs passed forward; no mid-workflow checkpoints
Crash recovery	Typically manual	Built-in checkpointing; resume from any node	Restart from the beginning of the crew run
Debugging	Print statements on message list	Time-travel debug via saved checkpoints; LangSmith tracing	Inspect task outputs; CrewAI dashboard at enterprise tier
Speed to prototype	Fast for simple cases	Slow — graph wiring adds boilerplate	Very fast — 20-line pipelines are common
Best for	Single-agent tool loops, chatbots	Complex branching workflows, human-in-the-loop, production systems	Multi-agent team pipelines, content workflows, research tasks

Mapping real frameworks onto the models

Once you can recognize the three mental models, it becomes easy to categorize any framework you encounter — including ones that were released after this article was written. Most frameworks are not pure instances of one model; they blend elements. The useful question is: what is the dominant abstraction, and what vocabulary does the framework use to express it?

// Major frameworks by dominant mental model

Agent framework: pick a dominant model

Loop model: OpenAI Agents SDK, Claude Agent SDK, bare API calls, ReAct implementations

Graph model: LangGraph (primary), LangChain Expression Language (partial)

Crew model: CrewAI, AutoGen / AG2 (crew + conversational loop hybrid)

The OpenAI Agents SDK: loop with guardrails

The OpenAI Agents SDK (released March 2025, a production successor to Swarm) is explicitly a loop model. Agents are defined with instructions and tools; the SDK manages the conversation loop, tool execution, and streaming. The one distinctive concept is the handoff — a tool call that transfers control to a different agent, bringing the current conversation state with it. This makes it possible to build multi-agent systems without a graph, by chaining agents through programmatic handoffs. The dominant vocabulary is: Agent, Runner, handoff, guardrail.
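The handoff concept can be sketched without the SDK. The code below is an approximation of the idea only and does not use or mirror the OpenAI Agents SDK API: `SimpleAgent`, `Turn`, and `run` are hypothetical names, and the `respond` functions stand in for LLM decisions.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Turn:
    text: str = ""
    handoff_to: Optional[str] = None   # set when the agent transfers control

@dataclass
class SimpleAgent:
    name: str
    respond: Callable[[list[dict]], Turn]

def triage_respond(messages: list[dict]) -> Turn:
    # Stand-in for an LLM decision: route refund questions elsewhere.
    if "refund" in messages[-1]["content"].lower():
        return Turn(handoff_to="billing")
    return Turn(text="How can I help?")

def billing_respond(messages: list[dict]) -> Turn:
    return Turn(text=f"Billing here. I can see {len(messages)} prior message(s).")

AGENTS = {
    "triage": SimpleAgent("triage", triage_respond),
    "billing": SimpleAgent("billing", billing_respond),
}

def run(start: str, user_input: str, max_handoffs: int = 3) -> tuple[str, str]:
    messages = [{"role": "user", "content": user_input}]
    active = AGENTS[start]
    for _ in range(max_handoffs + 1):
        turn = active.respond(messages)
        if turn.handoff_to is None:
            return active.name, turn.text
        # The handoff carries the whole conversation to the next agent.
        messages.append({"role": "system",
                         "content": f"handoff: {active.name} -> {turn.handoff_to}"})
        active = AGENTS[turn.handoff_to]
    raise RuntimeError("too many handoffs")

final_agent, reply = run("triage", "I want a refund for my order")
```

Control flow here is still a loop; the only thing that changes between iterations is which agent is active. That is why handoffs fit the loop model rather than the graph model: no edges are declared ahead of time.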

AutoGen / AG2: crew with a conversational backbone

AutoGen (and AG2, the community-maintained fork that carries forward the original AutoGen API) blends the crew and loop models. You define ConversableAgent objects with roles, then put them into a group chat or a two-agent conversation. Agents take turns speaking, so the "conversation" is the control flow. This makes AutoGen particularly strong for human-in-the-loop scenarios, where a human is literally one of the agents in the chat. The dominant vocabulary is: ConversableAgent, GroupChat, UserProxyAgent, AssistantAgent.
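The notion that the conversation is the control flow can be shown with a round-robin chat in plain Python. This is a rough sketch of the pattern and not AutoGen's or AG2's implementation: `group_chat`, `planner`, and `critic` are hypothetical, the speaker functions stand in for LLM-backed agents, and the `APPROVED` keyword is an assumed termination convention.

```python
from itertools import cycle

def planner(history):
    return "Plan: compare the loop, graph, and crew models."

def critic(history):
    # Approve once a plan is on the table; the keyword ends the chat.
    has_plan = any("Plan:" in message for _, message in history)
    return "APPROVED" if has_plan else "Need a plan first."

def group_chat(speakers, opening, max_turns=6):
    """Round-robin chat: whoever speaks next is the control flow."""
    history = [("user", opening)]
    for name, speak in cycle(speakers.items()):
        if len(history) > max_turns:      # safety cap on conversation length
            break
        message = speak(history)
        history.append((name, message))
        if message == "APPROVED":
            break
    return history

history = group_chat({"planner": planner, "critic": critic}, "Outline the article.")
```

A human participant fits into the same structure: a speaker function that reads from the terminal instead of calling a model would take its turn like any other agent.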

How to read any new framework

When you encounter a framework you have not seen before, ask three questions: (1) What is the core unit? (a node function, a while-loop iteration, a role-based agent) (2) How is state passed between steps? (typed shared dict, append-only message list, task output context) (3) What controls branching? (conditional edges, LLM tool calls, a manager agent's delegation). The answers place the framework in the map above within minutes, before you have written a single line of code.
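The three questions can be written down as a tiny scoring function. This is only a memory aid under simplifying assumptions: the answer labels and the names `Answers`, `SIGNALS`, and `dominant_model` are invented for illustration, and real frameworks rarely answer all three questions so cleanly.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Answers:
    core_unit: str    # "node", "iteration", or "role"
    state: str        # "typed_dict", "message_list", or "task_output"
    branching: str    # "conditional_edge", "tool_call", or "manager"

# Which answers point toward which mental model.
SIGNALS = {
    "graph": {"node", "typed_dict", "conditional_edge"},
    "loop": {"iteration", "message_list", "tool_call"},
    "crew": {"role", "task_output", "manager"},
}

def dominant_model(answers: Answers) -> str:
    """Return the model that matches the most answers (first one wins a tie)."""
    given = {answers.core_unit, answers.state, answers.branching}
    scores = {model: len(given & signals) for model, signals in SIGNALS.items()}
    return max(scores, key=scores.get)

# A hybrid: role-based agents that share a message list.
hybrid = Answers(core_unit="role", state="message_list", branching="manager")
verdict = dominant_model(hybrid)
```

For the hybrid above the verdict is the crew model, with one answer still pointing at the loop model. That split is the useful output: it tells you which vocabulary to expect and where the framework borrows from a second model.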

Going deeper

Once you are comfortable with the three mental models, the next level is understanding where they break down — and what modern frameworks are doing to extend them.

When graphs get unwieldy

LangGraph's graph model earns its complexity for workflows with 8-15+ nodes, conditional branches, and crash-recovery requirements. But many teams find that as a graph grows beyond 20 nodes it becomes difficult to reason about — the visual diagram of the graph itself becomes a source of confusion rather than clarity. At that scale, teams often break the single graph into a hierarchy of subgraphs (LangGraph supports this natively with the add_node + compiled-graph approach), where each subgraph is itself a well-understood loop or crew. The mental models are not mutually exclusive; sophisticated production systems often combine all three.

Stateful loops vs. stateless graphs

A common confusion is that stateful means graph model. In fact, loops can be stateful too — the message history IS the state. The difference is the structure of the state: the loop model's state is a linear sequence (conversation history), while the graph model's state is a structured, typed dict that can hold arbitrary named fields. For simple workflows, linear state is all you need. When you find yourself adding multiple ad-hoc fields to your message history — stuffing structured data into assistant messages, parsing JSON out of tool outputs — that is the signal that you have outgrown the loop model's state structure and the graph model's typed state dict would clean things up.
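The difference in state structure is easiest to see side by side. The sketch below uses invented data and hypothetical names (`flights_from_history`, `TripState`) to contrast structured data stuffed into a message list with the same facts held as named, typed fields.

```python
import json
from typing import TypedDict

# Loop-style: structured data smuggled through the message history.
messages = [
    {"role": "user", "content": "Find flights to Lisbon"},
    {"role": "tool",
     "content": json.dumps({"flights": ["TP123", "TP456"], "budget_ok": True})},
]

def flights_from_history(history: list[dict]) -> list[str]:
    # Every reader has to know which message holds the JSON and re-parse it.
    for message in reversed(history):
        if message["role"] == "tool":
            return json.loads(message["content"])["flights"]
    return []

# Graph-style: the same facts as named, typed fields.
class TripState(TypedDict):
    query: str
    flights: list[str]
    budget_ok: bool

state: TripState = {
    "query": "Find flights to Lisbon",
    "flights": ["TP123", "TP456"],
    "budget_ok": True,
}
```

Both hold the same information, but only the second lets a later step write `state["flights"]` without searching and parsing. Once helpers like `flights_from_history` start to multiply, the typed state is usually the cleaner option.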

The emerging fourth model: choreography

A fourth mental model is gaining traction in 2025-2026: choreography, where agents communicate via a shared event bus rather than being centrally orchestrated. No single node, loop, or manager controls the flow; each agent subscribes to event types, reacts independently, and publishes new events that other agents pick up. Frameworks like Dapr Agents and event-driven patterns on top of LangGraph implement this model. It scales better to very large multi-agent systems where centralized orchestration becomes a bottleneck, but debugging is significantly harder because causality is implicit in the event stream rather than explicit in graph edges or task lists.
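A minimal in-process event bus is enough to show the choreography pattern. This is a toy sketch, not how Dapr Agents or any production system works: `EventBus`, the event names, and the three handler functions are hypothetical, and a real deployment would use a message broker rather than an in-memory queue.

```python
from collections import defaultdict, deque

class EventBus:
    def __init__(self):
        self.handlers = defaultdict(list)   # event type -> subscribed handlers
        self.queue = deque()                # pending (event type, payload) pairs
        self.log = []                       # delivery order, for debugging

    def subscribe(self, event_type, handler):
        self.handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        self.queue.append((event_type, payload))

    def run(self, max_events=100):
        # Deliver events until the queue drains (or a safety cap is hit).
        while self.queue and len(self.log) < max_events:
            event_type, payload = self.queue.popleft()
            self.log.append(event_type)
            for handler in self.handlers[event_type]:
                handler(self, payload)

# Each agent only knows the event it reacts to and the event it emits.
def researcher(bus, topic):
    bus.publish("research.done", f"notes on {topic}")

def writer(bus, notes):
    bus.publish("draft.done", f"draft from {notes}")

def reviewer(bus, draft):
    bus.publish("review.done", f"approved: {draft}")

bus = EventBus()
bus.subscribe("topic.requested", researcher)
bus.subscribe("research.done", writer)
bus.subscribe("draft.done", reviewer)
bus.publish("topic.requested", "agent frameworks")
bus.run()
```

Nothing in the code states that the writer runs after the researcher; the order emerges from the subscriptions. The `log` list is the only place the causal chain is visible, which illustrates why debugging choreographed systems leans so heavily on event tracing.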

How mental models affect prompt engineering

The mental model you choose influences how you write prompts. In the loop model, one agent prompt does everything — it must handle tool selection, reasoning, and answer generation in a single system prompt that grows longer as the task grows. In the graph model, each node has a focused, small system prompt because the node only does one thing; routing logic lives in code, not in the LLM. In the crew model, each agent's system prompt is its role description — detailed backstory and goal statements that prime the LLM to behave like a specialist. The same content work — research, synthesis, and writing — looks completely different in each model's prompt architecture, which is why switching mental models often means rewriting prompts from scratch.

FAQ

What mental model does LangGraph use?

LangGraph uses the graph model: your agent workflow is a directed graph where nodes are Python functions that read and write a shared typed state object, and edges (including conditional edges) define what runs next. This gives you explicit control over branching, parallel execution, and crash-safe checkpointing, at the cost of more upfront boilerplate than simpler frameworks.

What is the ReAct loop and which frameworks use it?

ReAct (Reasoning + Acting) is the canonical loop mental model for agents: the LLM alternates between reasoning about what to do, calling a tool (acting), observing the result, and repeating until it can answer. Almost every agent framework implements this loop internally. Frameworks like the OpenAI Agents SDK and bare API-based approaches expose this loop directly; LangGraph and CrewAI use it inside nodes and tasks but add additional structure on top.

Is CrewAI better than LangGraph?

Neither is universally better; they use different mental models suited to different problems. CrewAI's crew model is faster for prototyping role-based multi-agent pipelines and has a gentle learning curve. LangGraph's graph model gives you precise control over branching, checkpointing, and complex state, but requires more setup. If your workflow looks like a team of specialists passing work down a chain, start with CrewAI. If it has many conditional branches or needs crash recovery, use LangGraph.

Can I mix mental models in the same agent system?

Yes, and production systems often do. A common pattern is to use a LangGraph graph as the outer orchestrator (for its state management and checkpointing), with individual nodes that internally run a CrewAI crew or a bare agent loop. Because a node is just a Python function, it can wrap a full crew or loop behind a single node; LangGraph also supports compiled subgraphs as nodes when the inner workflow is itself a graph.

Which mental model should beginners start with?

Start with the loop model — it is the closest to how LLMs actually work, has no framework-specific vocabulary to learn, and makes bugs obvious. Build a working agent with a raw API call and a while-loop before reaching for any framework. Once the loop becomes unwieldy (too many branches, state growing complex), you will know exactly which framework feature you need and why.

Why does the choice of mental model affect debugging?

Because each model has a different causal structure, bugs manifest differently. In the loop model, you trace through a growing message list to see what the LLM was told and what it responded. In the graph model, you inspect which node produced which state transition — LangGraph's checkpoint system lets you replay from any past state. In the crew model, you check each task's expected and actual output to find where context was dropped or contaminated.
