AI/TLDR

Outlines

Constrain any LLM to always produce valid structured output

Overview

Outlines is a Python library for structured generation. Instead of cleaning up an LLM's text after the fact with parsing and regex, it constrains the model's token output during generation so the result always matches the structure you asked for, whether that's a JSON schema, a Pydantic model, a regular expression, or a plain Python type.

It is aimed at developers who need reliable, machine-readable output from language models, for example to extract fields from text, classify inputs, or feed an LLM's result into downstream code without it breaking on malformed JSON. You pass the prompt and a desired output type, and Outlines guarantees the result conforms to it.
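To make "constraining during generation" concrete, here is a toy sketch of token masking in plain Python. It is not Outlines' implementation: the vocabulary, the `fake_scores` stand-in for a model, and the helper names are all hypothetical, and real libraries typically precompile the constraint over the tokenizer's vocabulary rather than checking prefixes one by one.

```python
# Toy illustration of constrained decoding; NOT Outlines' actual implementation.
# VOCAB, fake_scores, and the helper names are hypothetical.

ALLOWED = ["Positive", "Negative", "Neutral"]  # the only valid outputs
VOCAB = ["Pos", "Neg", "Neu", "itive", "ative", "tral", "Maybe", "!"]


def fake_scores(prefix: str) -> dict[str, float]:
    """Stand-in for a language model: it prefers tokens that would be invalid."""
    scores = {tok: 1.0 for tok in VOCAB}
    scores["Maybe"] = 9.0  # highest-scoring token, never valid here
    scores["!"] = 8.0
    scores["Neu"] = 5.0
    scores["tral"] = 5.0
    return scores


def is_valid_prefix(text: str) -> bool:
    """True if `text` can still be extended into one of the allowed outputs."""
    return any(option.startswith(text) for option in ALLOWED)


def constrained_generate() -> str:
    text = ""
    while text not in ALLOWED:
        scores = fake_scores(text)
        # Mask: keep only tokens that leave the output on a valid path.
        legal = {tok: s for tok, s in scores.items() if is_valid_prefix(text + tok)}
        # Greedy choice among the legal tokens only.
        text += max(legal, key=legal.get)
    return text


result = constrained_generate()
print(result)
```

Even though the stand-in model scores "Maybe" highest, that token is masked out at every step, so the loop can only ever finish on one of the allowed labels.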

Outlines runs the same code across several backends, including OpenAI, Ollama, vLLM, and Transformers, so you can switch models without rewriting your generation logic.

What it does

  • Guarantees valid structure during generation, not by post-hoc parsing or regex cleanup
  • Works across many model backends (OpenAI, Ollama, vLLM, Transformers, and more) with the same code
  • Accepts plain Python types as output specs: Literal for classification, int for numbers, Pydantic models for complex objects
  • Returns JSON you can validate directly with a Pydantic model's model_validate_json
  • Stays provider-independent, so you can change the underlying model without changing your code
  • Used in production by teams including NVIDIA, Cohere, Hugging Face, and vLLM
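The provider-independence point can be pictured with a small sketch. It assumes only the call shape used on this page, `model(prompt, output_type)`; the two stub backends and the `classify_sentiment` helper are hypothetical stand-ins so the example runs without any model installed.

```python
from typing import Any, Callable, Literal, get_args

# Any object callable as model(prompt, output_type) fits this shape.
Model = Callable[[str, Any], str]

Sentiment = Literal["Positive", "Negative", "Neutral"]


def classify_sentiment(model: Model, text: str) -> str:
    """Generation logic written once, independent of the backend behind `model`."""
    return model(f"Analyze: {text!r}", Sentiment)


# Hypothetical stand-ins for two different backends (e.g. a local and a hosted model).
def stub_local_model(prompt: str, output_type: Any) -> str:
    return get_args(output_type)[0]  # always picks the first allowed label


def stub_hosted_model(prompt: str, output_type: Any) -> str:
    return get_args(output_type)[-1]  # always picks the last allowed label


for backend in (stub_local_model, stub_hosted_model):
    label = classify_sentiment(backend, "This product completely changed my life!")
    print(backend.__name__, "->", label)
```

Under this assumption, switching providers means changing only the object passed as `model`; `classify_sentiment` stays untouched.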

Getting started

Install Outlines, connect it to a model, then pass the prompt plus the output type you want.

Install outlines

Install the package from PyPI.

bash
pip install outlines

Connect to a model

Wrap a model with one of the integrations. This example uses a local Transformers model.

python
import outlines
from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL_NAME = "microsoft/Phi-3-mini-4k-instruct"
model = outlines.from_transformers(
    AutoModelForCausalLM.from_pretrained(MODEL_NAME, device_map="auto"),
    AutoTokenizer.from_pretrained(MODEL_NAME)
)

Generate a simple structured output

Pass the prompt and a Python type. Outlines constrains the output to match it.

python
from typing import Literal

sentiment = model(
    "Analyze: 'This product completely changed my life!'",
    Literal["Positive", "Negative", "Neutral"]
)
print(sentiment)  # "Positive"

temperature = model("What's the boiling point of water in Celsius?", int)
print(temperature)  # 100
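One way to picture how a plain Python type becomes a constraint is to map it to a regular expression that the output must fully match. The mapping below is an illustrative assumption, not the patterns Outlines actually uses, and `pattern_for` is a hypothetical helper.

```python
import re
from typing import Literal, get_args, get_origin


def pattern_for(output_type) -> str:
    """Return a regex describing valid outputs for a few simple types (illustrative only)."""
    if output_type is int:
        return r"[+-]?\d+"
    if get_origin(output_type) is Literal:
        # A Literal becomes an alternation of its escaped choices.
        return "|".join(re.escape(str(choice)) for choice in get_args(output_type))
    raise TypeError(f"No pattern defined for {output_type!r}")


sentiment_pattern = pattern_for(Literal["Positive", "Negative", "Neutral"])
int_pattern = pattern_for(int)

print(sentiment_pattern)
print(bool(re.fullmatch(int_pattern, "100")))
print(bool(re.fullmatch(sentiment_pattern, "Maybe")))
```

A pattern like this describes the full set of acceptable outputs up front, which is what lets a constrained decoder reject invalid tokens step by step instead of checking the text afterwards.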

Generate a complex object

Define a Pydantic model as the output type, then validate the returned JSON against it.

python
from pydantic import BaseModel

class ProductReview(BaseModel):
    pros: list[str]
    cons: list[str]
    summary: str

review = model(
    "Review: The XPS 13 has great battery life and a stunning display, but it runs hot.",
    ProductReview,
    max_new_tokens=200,
)
review = ProductReview.model_validate_json(review)
print(review.summary)
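Constrained generation shapes each token, but a token limit such as `max_new_tokens` can presumably still cut the output off before the JSON is complete, so it may be worth guarding the validation step. The sketch below uses hard-coded strings in place of model output (an assumption made so it runs without a model), and `parse_review` is a hypothetical helper.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class ProductReview(BaseModel):
    pros: list[str]
    cons: list[str]
    summary: str


def parse_review(raw_json: str) -> Optional[ProductReview]:
    """Validate raw JSON text; return None instead of raising if it is incomplete."""
    try:
        return ProductReview.model_validate_json(raw_json)
    except ValidationError:
        return None


# Hard-coded stand-ins for model output.
complete = '{"pros": ["battery life", "display"], "cons": ["runs hot"], "summary": "Good laptop."}'
truncated = '{"pros": ["battery life", "display"], "cons": ["runs'

review = parse_review(complete)
print(review.summary if review else "incomplete output")
print(parse_review(truncated))
```

If truncation does show up in practice, raising the token limit or retrying the call are the obvious remedies.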

Commands and code are distilled from the project's own documentation; always check the official repo for the latest.

When to use it

  • Turn free-form customer emails into structured support tickets with priority, category, and action items for automated routing
  • Classify text inputs into a fixed set of labels using a Literal type
  • Extract typed fields (numbers, lists, nested objects) from documents into a Pydantic model you can pass straight to downstream code
  • Get reliable JSON from a local or hosted LLM without writing defensive parsing for malformed output
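As a sketch of the support-ticket case, the schema below pins priority and category to fixed labels so routing code can branch on them safely. The field names, label sets, queue names, and the hard-coded JSON standing in for model output are all hypothetical.

```python
from typing import Literal

from pydantic import BaseModel


class SupportTicket(BaseModel):
    priority: Literal["low", "medium", "high"]
    category: Literal["billing", "technical", "account"]
    action_items: list[str]


def route(ticket: SupportTicket) -> str:
    """Pick a queue name from the structured fields (hypothetical routing rule)."""
    if ticket.priority == "high":
        return "escalations"
    return f"{ticket.category}-queue"


# Stand-in for the JSON a model would return when given SupportTicket as the output type.
raw = '{"priority": "high", "category": "billing", "action_items": ["Refund duplicate charge"]}'
ticket = SupportTicket.model_validate_json(raw)
print(route(ticket))
```

Because every field is restricted to a known type or label set, `route` needs no fallback branch for unexpected values.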

How Outlines compares

Outlines alongside the other open-source structured-output tools that AI/TLDR tracks, ranked by GitHub stars.

Tool | Stars | What it does
Guidance | ★ 21.5k | A programming model that interleaves generation, prompting, and control logic to constrain output and enforce formats like JSON or regex patterns.
Outlines | ★ 14k | Constrain any LLM to always produce valid structured output
Instructor | ★ 13.2k | A library that wraps an LLM client to return data validated against a schema, retrying automatically on invalid output, with SDKs in several languages.
BAML | ★ 8.4k | A domain-specific language for defining LLM functions with typed schemas, parsing flexible model output into reliable structured data across many languages.
Marvin | ★ 6.2k | A Python toolkit from Prefect for turning LLM calls into typed functions that extract, classify, and cast text into structured Python objects.
LM Format Enforcer | ★ 2k | A library that enforces an output format such as JSON schema or regex by filtering the tokens an LLM is allowed to generate at each step.
XGrammar | ★ 1.8k | A fast, portable engine for grammar-constrained decoding that guarantees LLM output follows a given structure, used inside many inference servers.