AI/TLDR

What Is Instructor? Structured Outputs for Any Model

Learn how one small library turns any model's messy text into validated Pydantic objects, with automatic retries when the output doesn't fit.

INTERMEDIATE | 11 MIN READ | UPDATED 2026-06-12

In plain English

LLMs are text machines. You send them a prompt and they send back a string. If you need that string to be a customer record, a list of extracted entities, or a structured report with specific fields, you have to parse it yourself — and hope the model followed your instructions perfectly. Most of the time it does. Sometimes it doesn't, and your parser crashes at 2 a.m.

Instructor is a small Python library (also available in TypeScript, Go, Ruby, and more) that wraps your existing LLM client and adds one superpower: the response comes back as a validated Pydantic model, not a raw string. You define the shape of the output once in Python types. Instructor sends that schema to the model, parses what comes back, validates it with Pydantic, and — if validation fails — automatically retries by feeding the error message back to the model so it can self-correct.

The analogy is a strict JSON API contract. Imagine you hired a data-entry clerk (the LLM) to fill out a form. Without Instructor, the clerk can hand back the form any way they like — maybe some fields are missing, maybe the date is in the wrong format. With Instructor, the form is a typed schema; if the clerk fills it out wrong, you hand it back with a red pen explaining every error and ask them to try again. You get back the form only when every field is valid.

Why it matters

Raw LLM output causes a whole category of production bugs that never show up during development: the model adds an extra explanation paragraph before the JSON, uses null where you needed an empty list, or invents a field name that does not match your schema. These bugs are non-deterministic and intermittent, which makes them the worst kind to debug.

The four problems Instructor solves

  • Inconsistent output format — even with careful prompting, models occasionally drift from your requested JSON structure. Instructor validates every response before it reaches your application code.
  • Manual retry boilerplate — if you catch a ValidationError and want to retry, you need to rewrite the request with the error context. Instructor does this automatically with a configurable retry limit (default: 3).
  • Provider lock-in — different providers have different APIs for structured output. Instructor exposes one uniform response_model parameter regardless of whether the backend is OpenAI, Anthropic, Google Gemini, Mistral, Ollama, or a local model.
  • Schema-to-prompt overhead — Instructor generates the JSON schema from your Pydantic model and injects it into the request in whatever format the provider expects (function call, tool call, or native JSON schema mode), so you never write schema boilerplate by hand.
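To make the last point concrete, here is a minimal sketch of the schema-generation step using Pydantic's own `model_json_schema()`. The `Invoice` model is a made-up example, and the way Instructor wraps this schema for each provider is not shown here.

```python
from pydantic import BaseModel, Field

# Hypothetical model, used only for illustration.
class Invoice(BaseModel):
    customer: str = Field(description="Full name of the customer")
    total_usd: float
    line_items: list[str]

# Pydantic derives a JSON Schema dict from the type hints.
schema = Invoice.model_json_schema()

print(schema["required"])                         # ['customer', 'total_usd', 'line_items']
print(schema["properties"]["total_usd"]["type"])  # number
```

Field descriptions end up in the schema too, so they are a cheap place to give the model extra guidance.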

For data extraction pipelines, document parsing, classification systems, and any workflow where downstream code expects a specific data shape, Instructor turns a probabilistic text generator into something that behaves much more like a typed API endpoint.

How it works

Instructor uses a patch pattern: it wraps your existing LLM client object and intercepts chat.completions.create() (or the provider equivalent). The only change to your code is adding response_model=YourPydanticClass to the call. Everything else — schema generation, mode selection, parsing, validation, retry — happens inside the library.
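The patch pattern can be illustrated with a toy wrapper. This is a sketch of the idea only, not Instructor's implementation: `FakeClient` and `Patched` are hypothetical names, and the fake client returns a canned JSON string instead of calling a model.

```python
from pydantic import BaseModel

class FakeClient:
    """Hypothetical stand-in for an LLM SDK client; returns a canned JSON string."""
    def create(self, **kwargs) -> str:
        return '{"name": "John Doe", "age": 34}'

class Patched:
    """Wraps a client and adds a response_model keyword to create()."""
    def __init__(self, inner):
        self._inner = inner

    def create(self, response_model=None, **kwargs):
        raw = self._inner.create(**kwargs)  # the original call, otherwise untouched
        if response_model is None:
            return raw                      # behave like the unpatched client
        return response_model.model_validate_json(raw)  # parse and validate

class UserInfo(BaseModel):
    name: str
    age: int

client = Patched(FakeClient())
user = client.create(response_model=UserInfo, messages=[{"role": "user", "content": "..."}])
print(user.age + 1)  # 35
```

Because the wrapper keeps the original method name and arguments, existing call sites keep working and only the calls that pass `response_model` change behavior.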

Patching your client

You patch the client once at startup using instructor.from_openai(), instructor.from_anthropic(), or the unified instructor.from_provider() helper. The patched client has the same interface as the original, so no other code needs to change.

Basic Instructor usage (Python)
import instructor
from openai import OpenAI
from pydantic import BaseModel

# Patch once at startup
client = instructor.from_openai(OpenAI())

# Define the shape you want
class UserInfo(BaseModel):
    name: str
    age: int
    email: str

# Call the model — response comes back as a UserInfo instance
user = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=UserInfo,
    messages=[
        {"role": "user", "content": "Extract: John Doe, 34, john@example.com"}
    ],
)

print(user.name)   # 'John Doe'
print(user.age)    # 34
print(user.email)  # 'john@example.com'

The retry loop

When Pydantic raises a ValidationError, Instructor appends both the model's previous (invalid) response and the full validation error message to the conversation history, then calls the LLM again. The model sees exactly what it got wrong and self-corrects. You can control the number of attempts with max_retries.

Custom retry count (Python)
from pydantic import BaseModel, field_validator

class Product(BaseModel):
    name: str
    price_usd: float

    @field_validator('price_usd')
    @classmethod
    def price_must_be_positive(cls, v: float) -> float:
        if v <= 0:
            raise ValueError('price must be a positive number')
        return v

# Instructor retries up to 5 times if validation fails
result = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=Product,
    max_retries=5,
    messages=[{"role": "user", "content": "Parse: Widget X, free"}],
)
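The retry loop described in this section can be sketched without any network calls. The code below is an approximation under stated assumptions: `fake_llm` is a hypothetical stand-in for the model (wrong at first, correct once it sees an error message), `create_with_retries` and `max_attempts` are illustrative names, and the exact wording Instructor uses for its retry messages is not reproduced.

```python
from pydantic import BaseModel, ValidationError, field_validator

class Product(BaseModel):
    name: str
    price_usd: float

    @field_validator("price_usd")
    @classmethod
    def price_must_be_positive(cls, v: float) -> float:
        if v <= 0:
            raise ValueError("price must be a positive number")
        return v

def fake_llm(messages: list[dict]) -> str:
    """Hypothetical model call: returns a bad price until an error is fed back."""
    saw_error = any("Validation error" in m["content"] for m in messages)
    if saw_error:
        return '{"name": "Widget X", "price_usd": 9.99}'
    return '{"name": "Widget X", "price_usd": 0}'

def create_with_retries(messages: list[dict], max_attempts: int = 3) -> Product:
    messages = list(messages)  # do not mutate the caller's list
    last_error = None
    for _ in range(max_attempts):
        raw = fake_llm(messages)
        try:
            return Product.model_validate_json(raw)
        except ValidationError as exc:
            last_error = exc
            # Feed the bad answer and the error back so the next attempt can fix it.
            messages.append({"role": "assistant", "content": raw})
            messages.append({"role": "user", "content": f"Validation error, please fix:\n{exc}"})
    raise RuntimeError("all attempts failed") from last_error

product = create_with_retries([{"role": "user", "content": "Parse: Widget X, free"}])
print(product.price_usd)  # 9.99
```

Each failed attempt grows the conversation by two messages, which is where the extra token cost of retries comes from.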

Extraction modes

Instructor selects an extraction mode automatically based on the provider, or you can set it explicitly. The mode controls how the schema is communicated to the model.

Mode | How schema is sent | Best for
--- | --- | ---
TOOLS | OpenAI-style tool/function call arguments | OpenAI, Groq, Fireworks, most OpenAI-compatible endpoints
JSON_SCHEMA | Native structured output / response_format | OpenAI gpt-4o series with 100% schema compliance guarantee
MD_JSON | Ask for JSON inside a Markdown code block | Models that don't support tool calls; local models via Ollama
TOOLS_STRICT | Tool call with additionalProperties:false | When strict schema adherence is required on OpenAI
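To give a feel for what an MD_JSON-style mode has to do on the way back, here is a rough sketch that pulls a JSON object out of a Markdown-fenced reply using only the standard library. The function name `extract_json_block` and the sample reply are made up for illustration; Instructor's own parsing may differ.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built indirectly to keep this example readable

def extract_json_block(text: str) -> dict:
    """Return the first fenced JSON object in text, or parse the whole text."""
    pattern = FENCE + r"(?:json)?\s*(\{.*?\})\s*" + FENCE
    match = re.search(pattern, text, flags=re.DOTALL)
    payload = match.group(1) if match else text
    return json.loads(payload)

reply = (
    "Sure, here is the data:\n"
    + FENCE + "json\n"
    + '{"name": "Widget X", "tags": ["a", "b"]}\n'
    + FENCE
    + "\nLet me know if you need more."
)

data = extract_json_block(reply)
print(data["tags"])  # ['a', 'b']
```

The parsed dict would then go through Pydantic validation like any other mode, so chatty text around the code block does not reach application code.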

Multi-provider support

One of Instructor's biggest practical wins is that the same response_model pattern works across every supported provider. Switching from OpenAI to Anthropic Claude or Google Gemini is a one-line change — just swap the client.

Switching providers (Python)
import instructor
import anthropic
from pydantic import BaseModel

# Patch Anthropic's client instead
client = instructor.from_anthropic(anthropic.Anthropic())

class Summary(BaseModel):
    headline: str
    key_points: list[str]
    sentiment: str

summary = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    response_model=Summary,
    messages=[{"role": "user", "content": "Summarise this earnings call..."}],
)

print(summary.headline)
print(summary.key_points)

The unified instructor.from_provider() helper accepts provider strings like "openai/gpt-4o-mini" or "anthropic/claude-3-5-haiku-latest", removing even the need to import the underlying SDK directly in most cases.

  • OpenAI — GPT-4o, GPT-4o-mini, o1, o3 series; TOOLS, JSON_SCHEMA, TOOLS_STRICT modes
  • Anthropic — Claude 3 and 3.5 family; TOOLS mode (tool use)
  • Google Gemini — via instructor.from_gemini()
  • Mistral, Cohere, Groq, Fireworks — all OpenAI-compatible
  • Ollama / local models — via MD_JSON or JSON mode for models without native tool support
  • DeepSeek, Together AI, Azure OpenAI — full support with OpenAI-compatible adapter

Instructor vs. native structured outputs

OpenAI has offered native structured outputs since August 2024, and Anthropic has since added a comparable feature. So do you still need Instructor? The answer depends on what you need.

Native structured outputs win on simplicity and cost for projects that use only one provider and have simple schemas. Instructor wins on portability, complex validation logic, and model-agnostic code. In practice, Instructor can use the native JSON_SCHEMA mode under the hood when the provider supports it, so the two approaches are not mutually exclusive — you get native compliance plus Instructor's retry and multi-provider layer on top.

Common pitfalls and best practices

Instructor makes structured extraction reliable, but there are a few patterns that trip people up when starting out.

Pitfall 1: Overly complex schemas

The more deeply nested your Pydantic model, the harder the extraction task is for the model. A schema with seven levels of nesting and forty fields will fail more often than a flat one with five fields — and each failure costs extra tokens in the retry. Start with the minimal schema you actually need, then grow it.

Pitfall 2: Excessive retries

By default, Instructor retries up to 3 times. If you set max_retries very high on a model that consistently misunderstands your schema, you will burn tokens without success. If retries are regularly failing, the fix is usually a clearer schema or a better model — not more retries.

Pitfall 3: Pydantic error URL token waste

Pydantic automatically appends a documentation URL to every error message. When that error is sent back to the LLM as retry context, the URL consumes tokens without helping the model. Call instructor.disable_pydantic_error_url() at startup to strip these URLs from retry messages.

Strip Pydantic error URLs from retries (Python)
import instructor
from openai import OpenAI

# Call once before you create any clients
instructor.disable_pydantic_error_url()

client = instructor.from_openai(OpenAI())
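If you assemble your own error text, for logging or for a hand-rolled retry prompt, Pydantic can also drop the link per call. The sketch below uses `ValidationError.errors(include_url=False)` from Pydantic v2; the `Product` model is redefined here only so the snippet stands alone.

```python
from pydantic import BaseModel, ValidationError

class Product(BaseModel):
    name: str
    price_usd: float

try:
    Product.model_validate({"name": "Widget X", "price_usd": "free"})
except ValidationError as exc:
    # Same error entries as exc.errors(), minus the documentation link.
    compact = exc.errors(include_url=False)

print(compact[0]["type"])    # float_parsing
print("url" in compact[0])   # False
```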

Pitfall 4: Using Optional too liberally

Making every field Optional[str] to avoid validation failures means your model instance may be full of None values. Prefer required fields with sensible constraints, and use Optional only where the information is genuinely absent in the source text. This forces the LLM to extract only what is present and makes downstream code cleaner.
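A small comparison shows the difference. Both models below are hypothetical, and the email pattern is a deliberately crude shape check rather than real email validation.

```python
from typing import Optional

from pydantic import BaseModel, Field, ValidationError

class LooseContact(BaseModel):
    # Everything optional: an empty extraction still "validates".
    name: Optional[str] = None
    email: Optional[str] = None

class StrictContact(BaseModel):
    name: str = Field(min_length=1)            # must be present and non-empty
    email: str = Field(pattern=r"^\S+@\S+$")   # crude shape check, illustration only
    phone: Optional[str] = None                # often genuinely absent in the source

empty = LooseContact.model_validate({})
print(empty)  # name=None email=None

try:
    StrictContact.model_validate({})
    strict_failed = False
except ValidationError as exc:
    strict_failed = True
    print(exc.error_count())  # 2

valid = StrictContact.model_validate({"name": "John Doe", "email": "john@example.com"})
print(valid.phone)  # None
```

With the loose model, a useless extraction passes silently; with the strict one, the failure surfaces as a validation error that a retry can act on.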

Going deeper

Once you are comfortable with basic extraction, Instructor has several advanced features that unlock more powerful patterns.

Streaming partial objects

For long extractions, Instructor supports streaming partial Pydantic objects via create_partial(). Your UI can start rendering results while the model is still generating, rather than waiting for the entire response.

Streaming partial response (Python)
from pydantic import BaseModel

class FrameworkList(BaseModel):
    frameworks: list[str]

# Reuses the patched client from the earlier examples.
# create_partial yields progressively more complete FrameworkList objects.
for partial_list in client.chat.completions.create_partial(
    model="gpt-4o-mini",
    response_model=FrameworkList,
    messages=[{"role": "user", "content": "List 5 popular Python web frameworks"}],
):
    print(partial_list)  # fields fill in as tokens arrive

LLM-powered validators

Pydantic validators are just Python functions, which means you can call the LLM inside a validator to check semantic correctness — for example, verifying that a generated summary is factually grounded in the source text. Instructor's llm_validator helper makes this pattern concise.

LLM-powered field validation (Python)
from typing import Annotated

from instructor import llm_validator
from pydantic import BaseModel, BeforeValidator

class QuestionAnswer(BaseModel):
    question: str
    answer: Annotated[
        str,
        # llm_validator returns a plain callable, so wrap it in a Pydantic validator
        BeforeValidator(
            llm_validator(
                statement="The answer must not include any medical advice.",
                client=client,
                model="gpt-4o-mini",
            )
        ),
    ]

Hooks and observability

Instructor exposes hooks for events such as parse errors and completion responses (for example parse:error and completion:response, registered with client.on()), so you can log every retry, track token usage, or forward traces to an observability platform like Langfuse or Weights & Biases. If all retries fail, you can access the full attempt history from the InstructorRetryException.

Instructor vs. PydanticAI

Instructor and PydanticAI both give you Pydantic-validated outputs, but they sit at different levels of the stack. Instructor is a thin extraction layer — it patches an LLM client and returns a model instance. It has no concept of agents, tool execution, or multi-step reasoning. PydanticAI is a full agent framework with tools, dependency injection, and an agent loop. Use Instructor when you need reliable structured extraction from a single LLM call; use PydanticAI when you are building a multi-step agent that calls tools and produces a final typed result.

FAQ

Does Instructor work with local models like Llama or Mistral running on Ollama?

Yes. Point Instructor at an OpenAI-compatible endpoint (Ollama exposes one at http://localhost:11434/v1) and use MD_JSON mode for models that don't support tool calls, or TOOLS mode for models that do. The retry logic works identically regardless of the backend.

How does Instructor handle a response that fails validation on every retry?

After exhausting max_retries attempts, Instructor raises an InstructorRetryException. You can catch it to inspect the full attempt history — including each failed response and the corresponding Pydantic error — and decide whether to fall back to a default value, escalate to a stronger model, or surface the failure upstream.
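One way to structure the "escalate to a stronger model" option is sketched below. It is a generic pattern, not Instructor code: `ExtractionFailed` is a hypothetical stand-in for the exception raised once retries are exhausted, `fake_extract` replaces the real call, and the model names are placeholders.

```python
class ExtractionFailed(Exception):
    """Hypothetical stand-in for the error raised once retries are exhausted."""

def fake_extract(model: str) -> dict:
    """Pretend extraction call: the small model always fails validation."""
    if model == "small-model":
        raise ExtractionFailed("price_usd: price must be a positive number")
    return {"name": "Widget X", "price_usd": 9.99}

def extract_with_escalation(extract, models, default=None):
    """Try each model in order; return the first success, else the default."""
    failures = []
    for model in models:
        try:
            return extract(model), failures
        except ExtractionFailed as exc:
            failures.append((model, str(exc)))  # keep the history for logging
    return default, failures

result, failures = extract_with_escalation(fake_extract, ["small-model", "large-model"])
print(result["price_usd"])  # 9.99
print(len(failures))        # 1
```

In real code, the `except` clause would catch InstructorRetryException, and `extract` would wrap the patched client call with the chosen model name.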

Is Instructor just for Python, or are there other language clients?

Instructor has official ports for TypeScript/JavaScript (@instructor-ai/instructor, using Zod schemas), Go, Ruby, Elixir, and Rust. The Python library is the most mature, but the TypeScript version closely follows the same response_model API.

Does using Instructor cost extra tokens compared to a plain LLM call?

A successful single-pass call adds only the JSON schema to the prompt, which is typically a few hundred tokens. The real token cost comes from retries: each failed attempt sends the previous bad response plus the validation error back to the model. Keeping schemas simple and using capable models minimises retry rates and token overhead.

Can I use Instructor with OpenAI's native structured output mode to get the best of both?

Yes. Pass mode=instructor.Mode.JSON_SCHEMA when patching the client to use OpenAI's constrained-decoding JSON schema mode. You get OpenAI's 100% schema compliance guarantee plus Instructor's retry, streaming, and multi-provider abstraction on top.

What is the difference between Instructor and LangChain's output parsers?

LangChain output parsers are part of a larger framework with chains, retrievers, and agent abstractions baked in. Instructor is a standalone micro-library focused exclusively on the extraction layer — it patches any LLM client directly without requiring you to adopt a broader framework. This makes it easier to add to an existing codebase without restructuring anything.
