In plain English
When you call an LLM, it gives you back a string. That string might look like JSON, or it might be a paragraph, or it might ignore your instructions entirely and return something you can't parse. For a quick demo, that's fine. For production code that routes a customer refund or triggers a database write, it's a liability.
PydanticAI is a Python agent framework built by the team behind Pydantic, the validation library that the OpenAI SDK, the Anthropic SDK, and Google's ADK build on. Its core premise: AI agents deserve the same type guarantees as the rest of your Python code. You declare what shape the output should be (a Pydantic model), what tools the agent can call (decorated functions with full type hints), and what external dependencies it needs (injected via a typed context). The framework handles the rest: schema generation, JSON validation, retry on bad output, and observability.
The clearest analogy is FastAPI for agents. FastAPI popularised the idea of declaring HTTP request/response shapes with Python types and letting the framework generate validation, docs, and error handling automatically. PydanticAI applies the same philosophy to LLM-powered agents: declare everything with types, let the framework enforce them at runtime.
Why it matters
Most agent frameworks treat LLM output as a string to be parsed somewhere downstream. This creates a category of bugs that only appear at runtime, often in production: malformed JSON, missing required fields, wrong data types, injected content that breaks downstream logic. PydanticAI shrinks that category by making the LLM's contract explicit before the call and validating the output immediately after, so anything that does not match the declared schema is rejected before downstream code sees it.
Three concrete benefits for production teams
- Bugs move from runtime to write-time. Your IDE flags type mismatches when you write the code, not when a customer hits the broken path.
- Test dependencies in isolation. Because tools receive typed, injected dependencies rather than global state, you can swap a real database client for a mock in tests with a single line change.
- Model swaps are trivial. PydanticAI supports OpenAI, Anthropic, Google, xAI, Bedrock, Cohere, Mistral, Groq, Ollama, and more. Switching providers means changing one string — no code rewrites.
This matters because AI agents that call tools — look up databases, write files, call APIs — are doing things that are hard or impossible to undo. Untyped, unvalidated data flowing into those tools is exactly the kind of silent failure that causes real-world incidents. PydanticAI is designed around that threat model.
How it works
A PydanticAI application has four main building blocks: the Agent, Tools, Dependencies, and Structured Output. Each maps directly to a Python type or decorator.
Defining an agent
An Agent is typed with two generics: the dependency type and the output type. Both are ordinary Python types — dataclasses, Pydantic models, or primitives.

from dataclasses import dataclass
from typing import Any

from pydantic import BaseModel
from pydantic_ai import Agent, RunContext

@dataclass
class OrderDeps:
    db: Any  # your real DB client here

class OrderSummary(BaseModel):
    order_id: str
    status: str
    total_usd: float

# Agent is generic in [dependency type, output type]
agent: Agent[OrderDeps, OrderSummary] = Agent(
    'anthropic:claude-sonnet-4-6',
    deps_type=OrderDeps,
    output_type=OrderSummary,
    system_prompt='You are an order support assistant.',
)

Adding typed tools
Tools are regular Python functions decorated with @agent.tool. The function's docstring becomes the description the LLM sees; the parameter types become the JSON schema for tool arguments. PydanticAI validates arguments before your code ever runs.
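To make the "types become the schema" idea concrete, here is a minimal stdlib-only sketch of how a tool description could be derived from a function's signature and docstring. It is not PydanticAI's implementation: `tool_schema`, `JSON_TYPES`, and the extra `include_items` parameter are hypothetical names invented for illustration, and only a handful of primitive types are handled.

```python
import inspect
from typing import get_type_hints

# Hypothetical mapping from Python types to JSON Schema type names.
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def tool_schema(func, skip=("ctx",)):
    """Build a JSON-Schema-like description of func from its signature and docstring."""
    hints = get_type_hints(func)
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        if name in skip:  # the context argument comes from the framework, not the model
            continue
        properties[name] = {"type": JSON_TYPES[hints[name]]}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default means the model must supply it
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func),
        "parameters": {"type": "object", "properties": properties, "required": required},
    }


def lookup_order(ctx, order_id: str, include_items: bool = False) -> dict:
    """Look up an order by its ID and return status and total."""
    return {}


schema = tool_schema(lookup_order)
print(schema)
```

The point of the sketch is that the function signature is the single source of truth: changing a parameter's type or default changes what the model is told, with no separate schema file to keep in sync.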

@agent.tool
async def lookup_order(ctx: RunContext[OrderDeps], order_id: str) -> dict:
    """Look up an order by its ID and return status and total."""
    # ctx.deps is the OrderDeps instance passed to agent.run(...)
    return await ctx.deps.db.get_order(order_id)

Structured output and automatic retry
When the LLM returns its final response, PydanticAI runs it through Pydantic validation against output_type. If validation fails — a field is missing, a type is wrong, a value violates a constraint — the error message is sent back to the LLM as a follow-up message, giving it a chance to self-correct. You configure how many retries to allow.

import asyncio

async def main():
    # my_db_client is a placeholder for your real database client
    deps = OrderDeps(db=my_db_client)
    result = await agent.run('What is the status of order ORD-9821?', deps=deps)
    # result.output is a fully validated OrderSummary instance
    print(result.output.status)     # e.g. 'shipped'
    print(result.output.total_usd)  # e.g. 142.5 (a real float, not a string)

asyncio.run(main())

Key design choices
Dependency injection over global state
Many frameworks rely on global singletons or environment variables to wire in services like database clients or API keys. PydanticAI uses formal dependency injection instead: you declare a deps_type for the agent, pass an instance of it at call time, and every tool can access it through RunContext. This makes agents unit-testable — swap the real client for a test double without patching globals.
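The testing benefit can be sketched without any framework at all. The example below is a stdlib-only illustration of the pattern, not PydanticAI code: `OrderDB` and `FakeOrderDB` are hypothetical names, and the tool takes `deps` directly, whereas in PydanticAI it would reach the same object through `ctx.deps` on `RunContext`.

```python
import asyncio
from dataclasses import dataclass
from typing import Protocol


class OrderDB(Protocol):
    """The only thing the tool needs from a database client."""

    async def get_order(self, order_id: str) -> dict: ...


@dataclass
class OrderDeps:
    db: OrderDB


class FakeOrderDB:
    """In-memory test double; no network or real database involved."""

    def __init__(self, orders: dict):
        self.orders = orders

    async def get_order(self, order_id: str) -> dict:
        return self.orders[order_id]


async def lookup_order(deps: OrderDeps, order_id: str) -> dict:
    # The tool body only touches what was injected, never a module-level global.
    return await deps.db.get_order(order_id)


fake = FakeOrderDB({"ORD-9821": {"status": "shipped", "total_usd": 142.5}})
order = asyncio.run(lookup_order(OrderDeps(db=fake), "ORD-9821"))
print(order["status"])
```

Because the dependency is a constructor argument rather than an import, the swap to a fake happens in one place, at the call site, and nothing needs to be monkeypatched.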
Model-agnostic by design
The agent's model is a string identifier like 'openai:gpt-4o' or 'anthropic:claude-sonnet-4-6'. Swapping providers is a one-line change because all model-specific adapter logic lives inside PydanticAI, not your code. Supported providers include OpenAI, Anthropic, Google Gemini, xAI Grok, Mistral, Cohere, Groq, Amazon Bedrock, Hugging Face, Ollama, and OpenRouter.
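Since the model is just a string, one common way to exploit this is to read it from configuration. The sketch below is an assumption about how a project might wire this up, not part of PydanticAI: `resolve_model` and the `AGENT_MODEL` environment variable are hypothetical names, and the check only confirms the `provider:model` shape.

```python
import os

DEFAULT_MODEL = "openai:gpt-4o"  # identifier taken from the text above


def resolve_model(env=os.environ) -> str:
    """Return the 'provider:model' identifier, preferring an environment override."""
    # AGENT_MODEL is a hypothetical variable name chosen for this sketch.
    model = env.get("AGENT_MODEL", DEFAULT_MODEL)
    provider, sep, name = model.partition(":")
    if not (provider and sep and name):
        raise ValueError(f"expected 'provider:model', got {model!r}")
    return model


model = resolve_model({"AGENT_MODEL": "anthropic:claude-sonnet-4-6"})
print(model)
```

The resolved string would then be passed as the first argument to `Agent(...)`, so staging and production can run different providers without a code change.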
Observability with Pydantic Logfire
PydanticAI integrates natively with Pydantic Logfire, an OpenTelemetry-based observability platform. Once instrumentation is enabled, every agent run, tool call, and model exchange is traced: you see what was sent to the model, what it returned, which tools fired, and how long each step took. This is built in, not bolted on.
Human-in-the-loop and durable execution
Tools can be marked as requiring explicit human approval before they run. Combined with durable execution support (agent runs can be checkpointed and resumed across failures, through integrations with workflow engines such as Temporal), PydanticAI is designed for the kind of long-running, high-stakes workflows where you can't afford silent failures or lost progress.
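The approval idea can be illustrated with a small stdlib-only gate. This is a sketch of the concept, not PydanticAI's API: `ToolCall`, `run_with_approval`, `issue_refund`, and the 50 USD threshold are all hypothetical, and the `reviewer` callback stands in for a real person clicking approve or deny.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ToolCall:
    name: str
    args: dict


def run_with_approval(
    call: ToolCall,
    tool: Callable[..., str],
    approve: Callable[[ToolCall], bool],
) -> str:
    """Run the tool only if the approve callback returns True for this call."""
    if not approve(call):
        return f"{call.name} denied by reviewer"
    return tool(**call.args)


def issue_refund(order_id: str, amount_usd: float) -> str:
    return f"refunded {amount_usd:.2f} USD on {order_id}"


def reviewer(call: ToolCall) -> bool:
    # Hypothetical policy: only refunds under 50 USD pass without escalation.
    return call.args["amount_usd"] < 50


small = run_with_approval(
    ToolCall("issue_refund", {"order_id": "ORD-1", "amount_usd": 20.0}), issue_refund, reviewer
)
large = run_with_approval(
    ToolCall("issue_refund", {"order_id": "ORD-2", "amount_usd": 500.0}), issue_refund, reviewer
)
print(small)
print(large)
```

The key property is that the side effect sits behind the gate: a denied call returns a message the agent can reason about, and the refund function is never entered.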
Typical untyped agent code
- Output is a raw string
- Parse JSON manually, handle exceptions
- Tool args validated by hand (or not at all)
- Swap model = rewrite adapter code
- Test requires patching globals
- Bugs surface at runtime in production
With PydanticAI
- Output is a typed Pydantic model
- Validation runs automatically, retries on failure
- Tool args validated from type hints
- Swap model = change one string
- Test via injected mock deps
- Type errors caught at write-time by IDE
Going deeper
Once you're comfortable with the basics, PydanticAI has several advanced capabilities worth exploring.
Streaming structured output
PydanticAI can stream the model's output token-by-token while simultaneously running partial Pydantic validation. This lets your UI show progressive results without sacrificing type safety — the first valid partial object arrives as fast as possible.
Graph-based workflows
For complex multi-step pipelines, PydanticAI includes a graph API where nodes are typed Python functions connected by type-checked edges. This is the right tool when your workflow has branching logic, loops, or parallel paths that a single agent call can't capture. Think of it as a statically-typed alternative to something like LangGraph.
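The idea of type-checked edges can be sketched in plain Python. The example below is a stdlib-only illustration and does not use PydanticAI's graph API: the node names (`Classify`, `Refund`, `Answer`, `End`) and the `run_graph` driver are hypothetical. Each node's `run` return annotation names the only nodes it may hand off to, which is what lets a type checker reject an edge that should not exist.

```python
from dataclasses import dataclass


@dataclass
class End:
    result: str


@dataclass
class Classify:
    text: str

    def run(self) -> "Refund | Answer":
        # The return annotation lists the only nodes this one may hand off to.
        if "refund" in self.text.lower():
            return Refund(self.text)
        return Answer(self.text)


@dataclass
class Refund:
    text: str

    def run(self) -> End:
        return End("routed to refunds")


@dataclass
class Answer:
    text: str

    def run(self) -> End:
        return End("answered directly")


def run_graph(node) -> str:
    """Follow edges until a node returns End."""
    while not isinstance(node, End):
        node = node.run()
    return node.result


refund_path = run_graph(Classify("I want a refund"))
answer_path = run_graph(Classify("Where is my parcel?"))
print(refund_path, "|", answer_path)
```

Branching lives in ordinary `if` statements and the graph's shape lives in the annotations, so adding a new path means adding a node class and widening one return type.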
MCP and multi-agent protocols
PydanticAI supports the Model Context Protocol (MCP) and the Agent2Agent (A2A) protocol. Through MCP, an agent can connect to MCP servers and use the tools they expose; through A2A, an agent can be exposed to, and interoperate with, other agents in a typed way.
When to choose PydanticAI
- You already use Pydantic for data validation in your stack — PydanticAI slots in naturally.
- You need production-grade reliability — validated outputs, typed tools, and formal dependency injection all reduce error surface area.
- You want testable agents — dependency injection makes unit testing agent logic without real API calls straightforward.
- You need multi-model flexibility — the model-agnostic design prevents vendor lock-in.
- You're building structured-output pipelines — if your agent's job is to extract or transform data into a defined schema, PydanticAI is purpose-built for that.
FAQ
Is PydanticAI the same as the Pydantic library?
No. Pydantic is the standalone data-validation library (used widely for parsing JSON into Python objects). PydanticAI is a separate agent framework built on top of Pydantic by the same team. You can use Pydantic without PydanticAI, but PydanticAI uses Pydantic heavily for schema generation and output validation.
Does PydanticAI only work with OpenAI?
No — it's explicitly model-agnostic. Supported providers include OpenAI, Anthropic, Google Gemini, xAI Grok, Mistral, Cohere, Groq, Amazon Bedrock, Hugging Face, Ollama (for local models), and OpenRouter. Switching providers is a one-string change.
What happens when the LLM returns output that fails Pydantic validation?
PydanticAI automatically sends the validation error back to the model as a follow-up message, asking it to correct the output. You configure a retries limit on the agent. If the model still fails after that many attempts, a typed exception is raised so you can handle it explicitly.
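The validate-and-retry loop described above can be sketched as follows. This is a stdlib-only approximation under stated assumptions, not PydanticAI's code: a dataclass and a hand-written `validate` function stand in for Pydantic validation, `fake_model` stands in for a real LLM call, and `RuntimeError` stands in for the framework's own exception type.

```python
import json
from dataclasses import dataclass


@dataclass
class OrderSummary:
    order_id: str
    status: str
    total_usd: float


def validate(raw: str) -> OrderSummary:
    """Stdlib stand-in for Pydantic validation: raise ValueError on any mismatch."""
    data = json.loads(raw)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    fields = OrderSummary.__annotations__
    for name, typ in fields.items():
        if name not in data:
            raise ValueError(f"missing field: {name}")
        if not isinstance(data[name], typ):
            raise ValueError(f"{name} must be {typ.__name__}")
    return OrderSummary(**{name: data[name] for name in fields})


def run_with_retries(ask_model, prompt: str, retries: int = 1) -> OrderSummary:
    """Call ask_model, feeding each validation error back, for up to `retries` extra attempts."""
    message, error = prompt, ""
    for _ in range(retries + 1):
        try:
            return validate(ask_model(message))
        except ValueError as exc:
            error = str(exc)
            message = f"{prompt}\nYour previous reply was invalid: {error}. Please fix it."
    raise RuntimeError(f"output still invalid after {retries} retries: {error}")


# Scripted replies: the first omits total_usd, the second is valid.
replies = iter([
    '{"order_id": "ORD-9821", "status": "shipped"}',
    '{"order_id": "ORD-9821", "status": "shipped", "total_usd": 142.5}',
])
seen = []


def fake_model(message: str) -> str:
    seen.append(message)
    return next(replies)


summary = run_with_retries(fake_model, "Summarise order ORD-9821 as JSON.")
print(summary)
```

The second message the fake model receives contains the text of the first failure, which is the mechanism that gives the model a chance to self-correct rather than simply being asked the same question again.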
How is PydanticAI different from LangChain?
LangChain is a much broader framework (chains, memory, retrieval, many integrations) that evolved organically and historically relied on string-based data flow. PydanticAI is narrower and more opinionated: it focuses on typed agents with validated I/O and formal dependency injection. Teams often describe PydanticAI as requiring less boilerplate for production agent use cases, though the two frameworks have different sweet spots.
Can I use PydanticAI for streaming responses?
Yes. PydanticAI supports streaming structured output — it delivers partial validated objects as the model generates tokens, rather than waiting for the full response. This is useful for chatbot UIs where you want progressive display without giving up type safety.
Do I need to know Pydantic before using PydanticAI?
Basic Pydantic knowledge helps a lot — you'll define output schemas as Pydantic BaseModel subclasses. If you've already used FastAPI (which also uses Pydantic), the mental model transfers directly. If not, Pydantic's own docs are short and approachable.