Overview
Outlines is a Python library for structured generation. Instead of cleaning up an LLM's text after the fact with parsing and regex, it constrains the model's token output during generation so the result always matches the structure you asked for, whether that's a JSON schema, a Pydantic model, a regular expression, or a plain Python type.
It is aimed at developers who need reliable, machine-readable output from language models, for example to extract fields from text, classify inputs, or feed an LLM's result into downstream code without it breaking on malformed JSON. You pass the prompt and a desired output type, and Outlines guarantees the result conforms to it.
Outlines runs the same code across several backends, including OpenAI, Ollama, vLLM, and Transformers, so you can switch models without rewriting your generation logic.
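One way to picture backend switching is to write the generation logic against a plain callable that takes a prompt and an output type, which is the call shape used in the examples on this page. The sketch below is illustrative only: `fake_local_model` and `fake_hosted_model` are hypothetical stand-ins, not real Outlines integrations, and `classify_sentiment` is a name invented for this example.

```python
from typing import Callable, Literal, get_args

Sentiment = Literal["Positive", "Negative", "Neutral"]

# Any backend is treated as a callable taking (prompt, output_type) and
# returning text, mirroring the model(prompt, type) call shape on this page.
Model = Callable[[str, object], str]


def classify_sentiment(model: Model, text: str) -> str:
    """Generation logic written once, with no backend-specific code."""
    label = model(f"Analyze: {text!r}", Sentiment)
    # Defensive check for the stubs below; a constrained backend
    # would make this check redundant.
    if label not in get_args(Sentiment):
        raise ValueError(f"unexpected label: {label!r}")
    return label


# Hypothetical stand-ins for two different backends (assumptions for
# illustration, not real Outlines models).
def fake_local_model(prompt: str, output_type: object) -> str:
    return get_args(output_type)[0]


def fake_hosted_model(prompt: str, output_type: object) -> str:
    return get_args(output_type)[-1]


print(classify_sentiment(fake_local_model, "I love it"))    # Positive
print(classify_sentiment(fake_hosted_model, "It is fine"))  # Neutral
```

Because `classify_sentiment` only depends on the call shape, swapping the backend means changing the object passed in, not the function.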
What it does
- Guarantees valid structure during generation, not by post-hoc parsing or regex cleanup
- Works across many model backends (OpenAI, Ollama, vLLM, Transformers, and more) with the same code
- Accepts plain Python types as output specs: Literal for classification, int for numbers, Pydantic models for complex objects
- Returns JSON you can validate directly with a Pydantic model's model_validate_json
- Keeps your generation logic independent of the provider, so you can change the underlying model without changing your code
- Used in production by teams including NVIDIA, Cohere, Hugging Face, and vLLM
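The first point above, constraining tokens during generation rather than cleaning up afterwards, can be illustrated with a toy decoder. This is a sketch of the general idea only and not Outlines' actual implementation: the vocabulary, the scores in `TOY_LOGITS`, and the function names are all made up for the example, and it assumes no allowed choice is a prefix of another.

```python
CHOICES = ["Positive", "Negative", "Neutral"]

# Toy vocabulary and fixed scores standing in for a real model's logits.
VOCAB = ["Great", "!", "Pos", "Neg", "Neu", "itive", "ative", "tral"]
TOY_LOGITS = {
    "Great": 5.0, "!": 4.0, "Pos": 3.0, "Neu": 2.0,
    "Neg": 1.0, "itive": 1.0, "ative": 1.0, "tral": 1.0,
}


def toy_score(prefix: str, token: str) -> float:
    """Hypothetical scorer; a real model would condition on the prefix."""
    return TOY_LOGITS[token]


def allowed_next_tokens(prefix, choices, vocab):
    """Tokens that keep prefix + token a prefix of at least one choice."""
    return [
        tok for tok in vocab
        if tok and any(choice.startswith(prefix + tok) for choice in choices)
    ]


def constrained_greedy_decode(score, choices, vocab):
    """Greedy decoding with disallowed tokens masked out at every step."""
    out = ""
    while out not in choices:
        candidates = allowed_next_tokens(out, choices, vocab)
        if not candidates:
            raise ValueError(f"vocabulary cannot complete {out!r}")
        # Only allowed tokens compete, however highly the rest are scored.
        out += max(candidates, key=lambda tok: score(out, tok))
    return out


# Without a constraint, the top-scoring first token is off-schema.
print(max(VOCAB, key=lambda tok: toy_score("", tok)))        # Great
# With the mask, the result is always one of the allowed choices.
print(constrained_greedy_decode(toy_score, CHOICES, VOCAB))  # Positive
```

The point of the sketch is that invalid output is never produced in the first place, so there is nothing to repair afterwards.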
Getting started
Install Outlines, connect it to a model, then pass the prompt plus the output type you want.
Install outlines
Install the package from PyPI.
pip install outlines
Connect to a model
Wrap a model with one of the integrations. This example uses a local Transformers model.
import outlines
from transformers import AutoTokenizer, AutoModelForCausalLM
MODEL_NAME = "microsoft/Phi-3-mini-4k-instruct"
model = outlines.from_transformers(
    AutoModelForCausalLM.from_pretrained(MODEL_NAME, device_map="auto"),
    AutoTokenizer.from_pretrained(MODEL_NAME),
)
Generate a simple structured output
Pass the prompt and a Python type. Outlines constrains the output to match it.
from typing import Literal
sentiment = model(
    "Analyze: 'This product completely changed my life!'",
    Literal["Positive", "Negative", "Neutral"],
)
print(sentiment)  # "Positive"
temperature = model("What's the boiling point of water in Celsius?", int)
print(temperature)  # 100
Generate a complex object
Define a Pydantic model as the output type, then validate the returned JSON against it.
from pydantic import BaseModel
class ProductReview(BaseModel):
    pros: list[str]
    cons: list[str]
    summary: str
review = model(
    "Review: The XPS 13 has great battery life and a stunning display, but it runs hot.",
    ProductReview,
    max_new_tokens=200,
)
# The model returns a JSON string; parse and validate it into a ProductReview.
review = ProductReview.model_validate_json(review)
print(review.summary)
Commands and code are distilled from the project's own documentation; always check the official repo for the latest.
When to use it
- Turn free-form customer emails into structured support tickets with priority, category, and action items for automated routing
- Classify text inputs into a fixed set of labels using a Literal type
- Extract typed fields (numbers, lists, nested objects) from documents into a Pydantic model you can pass straight to downstream code
- Get reliable JSON from a local or hosted LLM without writing defensive parsing for malformed output
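For the support-ticket case in the first item above, the routing step is ordinary Python once the output is structured. The sketch below is a minimal illustration under stated assumptions: the `Ticket` fields, the allowed priorities, the queue names, and the hand-written `sample` string (standing in for model output) are all hypothetical. In practice the schema would be a Pydantic model passed as the output type, as in the ProductReview example; a dataclass is used here only to keep the sketch dependency-free.

```python
import json
from dataclasses import dataclass

ALLOWED_PRIORITIES = ("low", "medium", "high")


@dataclass
class Ticket:
    priority: str
    category: str
    action_items: list[str]


def parse_ticket(raw_json: str) -> Ticket:
    """Parse a JSON string into a Ticket, rejecting unknown priorities."""
    # Ticket(**data) raises TypeError if keys are missing or unexpected.
    ticket = Ticket(**json.loads(raw_json))
    if ticket.priority not in ALLOWED_PRIORITIES:
        raise ValueError(f"unknown priority: {ticket.priority!r}")
    return ticket


def route(ticket: Ticket) -> str:
    """Pick a queue: high priority goes to on-call, the rest by category."""
    if ticket.priority == "high":
        return "on-call"
    return f"{ticket.category}-queue"


# Hand-written sample standing in for a model's structured output.
sample = (
    '{"priority": "high", "category": "billing", '
    '"action_items": ["Refund duplicate charge"]}'
)
print(route(parse_ticket(sample)))  # on-call
```

Because the fields are typed and the label set is fixed, the routing code needs no fallback branches for free-form text.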
How Outlines compares
The table below lists Outlines alongside the other open-source structured-output tools that AI/TLDR tracks, ranked by GitHub stars.
| Tool | Stars | What it does |
|---|---|---|
| Guidance | ★ 21.5k | A programming model that interleaves generation, prompting, and control logic to constrain output and enforce formats like JSON or regex patterns. |
| Outlines | ★ 14k | A Python library that constrains an LLM during generation so it always produces valid structured output. |
| Instructor | ★ 13.2k | A library that wraps an LLM client to return data validated against a schema, retrying automatically on invalid output, with SDKs in several languages. |
| BAML | ★ 8.4k | A domain-specific language for defining LLM functions with typed schemas, parsing flexible model output into reliable structured data across many languages. |
| Marvin | ★ 6.2k | A Python toolkit from Prefect for turning LLM calls into typed functions that extract, classify, and cast text into structured Python objects. |
| LM Format Enforcer | ★ 2k | A library that enforces an output format such as JSON schema or regex by filtering the tokens an LLM is allowed to generate at each step. |
| XGrammar | ★ 1.8k | A fast, portable engine for grammar-constrained decoding that guarantees LLM output follows a given structure, used inside many inference servers. |