Overview
LM Format Enforcer is a Python library that makes a language model stick to a required output format, such as a JSON Schema or a regular expression. Instead of relying on prompt wording and hoping the model complies, it filters the tokens the model is allowed to produce at each step, so the result always matches the format you asked for.
It is aimed at developers who need reliable structured output from local or self-hosted models and are tired of parsing and re-trying malformed responses. It works with any Python language model and tokenizer, and already integrates with transformers, LangChain, LlamaIndex, llama.cpp, vLLM, Haystack, NVIDIA TensorRT-LLM, and ExLlamaV2.
As a structured-output tool, it slots into your existing generation pipeline rather than replacing it. It still lets the model choose whitespace and field ordering within a JSON schema, which keeps the text natural and reduces hallucinations while guaranteeing the shape of the output.
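To make the token-filtering idea concrete, here is a minimal sketch that uses only the standard library. It is not the library's implementation: the vocabulary, the date template, and the `fake_model_score` function are all assumptions invented for illustration. At each step the "model" ranks tokens however it likes, but only tokens that keep the output a valid prefix of the format are eligible.

```python
import string

# Toy format: an ISO-style date, DDDD-DD-DD, where D is any digit.
TEMPLATE = "DDDD-DD-DD"

# Toy vocabulary (hypothetical); real tokenizers have far more tokens.
VOCAB = ["19", "63", "02", "17", "-", "Feb", " ", "1", "7", "born"]


def fits_template(text: str) -> bool:
    """Return True if text is a valid prefix of the date template."""
    if len(text) > len(TEMPLATE):
        return False
    for ch, slot in zip(text, TEMPLATE):
        if slot == "D" and ch not in string.digits:
            return False
        if slot == "-" and ch != "-":
            return False
    return True


def allowed_tokens(prefix: str) -> list[str]:
    """Keep only tokens that leave the output a valid prefix of the format."""
    return [tok for tok in VOCAB if fits_template(prefix + tok)]


def fake_model_score(prefix: str, tok: str) -> float:
    """Stand-in for model preferences: likes prose, dislikes repeating itself."""
    base = {"born": 0.9, "Feb": 0.8, "19": 0.7, "63": 0.6,
            "-": 0.5, "02": 0.4, "17": 0.3}
    penalty = 1.0 if tok in prefix else 0.0
    return base.get(tok, 0.0) - penalty


def generate() -> str:
    """Greedy decoding restricted to the allowed tokens at every step."""
    out = ""
    while len(out) < len(TEMPLATE):
        candidates = allowed_tokens(out)
        out += max(candidates, key=lambda tok: fake_model_score(out, tok))
    return out


date = generate()
print(date)  # → 1963-02-17
```

The highest-scoring tokens overall ("born", "Feb") are never emitted because they would break the format, yet the model still chooses freely among the tokens that fit.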
What it does
- Enforces JSON Schema, schemaless JSON mode, and regular-expression formats
- Works with any Python language model and tokenizer; integrates with transformers, LangChain, LlamaIndex, llama.cpp, vLLM, Haystack, TensorRT-LLM, and ExLlamaV2
- Supports batched generation and beam search, filtering tokens per input or per beam
- Handles required and optional fields, plus nested objects, arrays, and dictionaries in JSON schemas
- Lets the model control whitespace and field ordering, reducing hallucinations
- Drops into existing pipelines without changing the high-level generation loop
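The handling of required and optional fields, together with free field ordering, can be pictured with a small sketch. This is an illustration under assumptions, not the library's internal logic: `next_object_options` and the `nickname` field are hypothetical. The idea is that any not-yet-emitted key may come next, while the closing brace only becomes available once every required key has appeared.

```python
def next_object_options(required: set[str], optional: set[str],
                        emitted: list[str]) -> dict:
    """Return which keys may come next and whether the object may close."""
    seen = set(emitted)
    remaining_required = required - seen
    remaining_optional = optional - seen
    return {
        # Any unused key is allowed, in any order the model prefers.
        "keys": sorted(remaining_required | remaining_optional),
        # Closing the object is only legal once nothing required is missing.
        "may_close": not remaining_required,
    }


required = {"first_name", "last_name"}
optional = {"nickname"}  # hypothetical optional field

start = next_object_options(required, optional, [])
midway = next_object_options(required, optional, ["last_name"])
done = next_object_options(required, optional, ["last_name", "first_name"])

print(start)
print(midway)
print(done)
```

In this sketch the model may emit `last_name` before `first_name`, and may skip `nickname` entirely, but it cannot close the object early.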
Getting started
Install the package, then build a parser from your schema and pass it into your generation pipeline as a token filter. The example below uses Hugging Face transformers, straight from the README.
Install the package
Install from PyPI. For the transformers example you also need transformers, torch, and the model's runtime dependencies.
pip install lm-format-enforcer
Define your output schema
Describe the shape you want with a Pydantic model.
from pydantic import BaseModel

class AnswerFormat(BaseModel):
    first_name: str
    last_name: str
    year_of_birth: int
    num_seasons_in_nba: int
Build a token filter and run generation
Create a JsonSchemaParser, build a prefix function from it and your tokenizer, then pass that function to the transformers pipeline so the output matches the schema.
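from lmformatenforcer import JsonSchemaParser
from lmformatenforcer.integrations.transformers import build_transformers_prefix_allowed_tokens_fn
from transformers import pipeline
# Load the model and tokenizer through a standard text-generation pipeline.
hf_pipeline = pipeline('text-generation', model='TheBloke/Llama-2-7b-Chat-GPTQ', device_map='auto')
# Include the schema in the prompt so the model knows which fields to fill in.
# schema_json() and schema() are the Pydantic v1 names; Pydantic v2 still accepts them but deprecates them in favor of model_json_schema().
prompt = f'Here is information about Michael Jordan in the following json schema: {AnswerFormat.schema_json()} :\n'
# Build a parser from the schema and wrap it as a prefix function for this tokenizer.
parser = JsonSchemaParser(AnswerFormat.schema())
prefix_function = build_transformers_prefix_allowed_tokens_fn(hf_pipeline.tokenizer, parser)
# The prefix function restricts which tokens the pipeline may emit at each step.
output_dict = hf_pipeline(prompt, prefix_allowed_tokens_fn=prefix_function)
# generated_text holds the prompt followed by the completion; keep only the completion.
result = output_dict[0]['generated_text'][len(prompt):]
print(result)
Commands and code are distilled from the project's own documentation; always check the official repo for the latest.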
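Once generation finishes, `result` is a JSON string that downstream code can parse directly. The sketch below shows one way to sanity-check such a string using only the standard library. `ANSWER_SCHEMA` is a hand-written approximation of the schema for `AnswerFormat`, `check_record` is a hypothetical helper, and `sample_result` is a stand-in value, since real model output will vary.

```python
import json

# Hand-written approximation of the JSON Schema for AnswerFormat (assumption).
ANSWER_SCHEMA = {
    "type": "object",
    "properties": {
        "first_name": {"type": "string"},
        "last_name": {"type": "string"},
        "year_of_birth": {"type": "integer"},
        "num_seasons_in_nba": {"type": "integer"},
    },
    "required": ["first_name", "last_name", "year_of_birth", "num_seasons_in_nba"],
}

PYTHON_TYPES = {"string": str, "integer": int}


def check_record(text: str, schema: dict) -> dict:
    """Parse text as JSON and check required keys and primitive types."""
    record = json.loads(text)
    for key in schema["required"]:
        if key not in record:
            raise ValueError(f"missing field: {key}")
    for key, value in record.items():
        if key not in schema["properties"]:
            raise ValueError(f"unexpected field: {key}")
        expected = PYTHON_TYPES[schema["properties"][key]["type"]]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            raise ValueError(f"wrong type for {key}")
    return record


# Hypothetical stand-in for `result`; actual model output will differ.
sample_result = (
    '{"first_name": "Michael", "last_name": "Jordan", '
    '"year_of_birth": 1963, "num_seasons_in_nba": 15}'
)
record = check_record(sample_result, ANSWER_SCHEMA)
print(record["first_name"], record["year_of_birth"])
```

With format enforcement in place, a check like this should rarely fail on shape; it remains useful as a guard when the same code path also handles unconstrained output.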
When to use it
- Extracting reliable JSON records from a local model for downstream parsing without retry loops
- Serving structured output from a vLLM OpenAI-compatible server using the lm-format-enforcer guided-decoding backend
- Constraining model output to a regular expression, for example a date, phone number, or enum value
- Producing valid responses for nested schemas with optional fields inside transformers, llama.cpp, or LlamaIndex pipelines
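For the regular-expression use case, the patterns involved are usually short. The sketch below lists illustrative patterns for a date, a phone number, and an enum value, and uses Python's standard `re` module only to show which strings each pattern admits. The pattern names and the `conforms` helper are assumptions for this example; whether a given regex feature is supported by the library's own regex engine should be checked against the official repo.

```python
import re

# Illustrative patterns (assumptions), kept to basic regex features.
PATTERNS = {
    "iso_date": r"\d{4}-\d{2}-\d{2}",
    "us_phone": r"\(\d{3}\) \d{3}-\d{4}",
    "sentiment": r"(positive|negative|neutral)",
}


def conforms(kind: str, text: str) -> bool:
    """Return True if the whole text matches the named pattern."""
    return re.fullmatch(PATTERNS[kind], text) is not None


examples = [
    ("iso_date", "1963-02-17"),
    ("iso_date", "Feb 17, 1963"),
    ("us_phone", "(555) 010-4477"),
    ("sentiment", "neutral"),
    ("sentiment", "mostly positive"),
]
for kind, text in examples:
    print(f"{kind:10} {text!r:20} {conforms(kind, text)}")
```

Constraining generation to such a pattern means the second and fifth strings could never be produced, so no post-hoc cleanup is needed.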
How LM Format Enforcer compares
The table below lists LM Format Enforcer alongside the other open-source structured-output tools that AI/TLDR tracks, ranked by GitHub stars.
| Tool | Stars | What it does |
|---|---|---|
| Guidance | ★ 21.5k | A programming model that interleaves generation, prompting, and control logic to constrain output and enforce formats like JSON or regex patterns. |
| Outlines | ★ 14k | A library for structured generation that constrains an LLM's token output to match a JSON schema, regex, or grammar so the result is always valid. |
| Instructor | ★ 13.2k | A library that wraps an LLM client to return data validated against a schema, retrying automatically on invalid output, with SDKs in several languages. |
| BAML | ★ 8.4k | A domain-specific language for defining LLM functions with typed schemas, parsing flexible model output into reliable structured data across many languages. |
| Marvin | ★ 6.2k | A Python toolkit from Prefect for turning LLM calls into typed functions that extract, classify, and cast text into structured Python objects. |
| LM Format Enforcer | ★ 2k | A Python library that forces an LLM's output to match a JSON Schema or regex by filtering the tokens it can generate. |
| XGrammar | ★ 1.8k | A fast, portable engine for grammar-constrained decoding that guarantees LLM output follows a given structure, used inside many inference servers. |