Overview
BAML ("Basically a Made-up Language") is a small domain-specific language for writing LLM prompts. Instead of building free-form strings, you define each prompt as a function with typed inputs and a typed return value. Its Rust compiler then generates a client library (the baml_client) that you call from your own code, so model output comes back as structured, type-safe data rather than raw text you have to parse by hand.
It is aimed at developers building AI workflows and agents who want predictable, structured output across different models. You only write the prompts in BAML; the rest of your application stays in the language you already use. BAML provides quickstarts for Python, TypeScript, Ruby, and Go, plus a REST API path for other languages.
Within the structured-output and LLM-orchestration space, BAML focuses on schema engineering. It includes streaming, retries, fallbacks, and model rotations. Its schema-aligned parsing (SAP) approach parses loosely formatted model responses into the declared return type, so structured output and tool-calling work even with models that do not support native tool-calling APIs.
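To make the idea of parsing a flexible response against a schema concrete, here is a minimal Python sketch. It is not BAML's SAP algorithm; it only illustrates the general idea of pulling a JSON object out of a chatty reply and coercing its fields to the declared types. The `Resume` class, the `parse_aligned` helper, and the sample reply are hypothetical names and data invented for this illustration.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class Resume:
    # Hypothetical target schema for this illustration.
    name: str
    years_experience: int


def parse_aligned(raw: str, schema):
    """Extract the first JSON object from raw text and coerce it to `schema`.

    Illustrative only: this is NOT BAML's schema-aligned parsing algorithm.
    """
    start = raw.find("{")
    if start == -1:
        raise ValueError("no JSON object found in model output")
    # raw_decode parses one JSON value and ignores any trailing text.
    data, _ = json.JSONDecoder().raw_decode(raw[start:])
    kwargs = {}
    for f in fields(schema):
        if f.name not in data:
            raise ValueError(f"missing field: {f.name}")
        # Coerce, e.g., the string "7" to the int 7 when the schema asks for int.
        kwargs[f.name] = f.type(data[f.name])
    return schema(**kwargs)


# A made-up reply with prose around the JSON and a number sent as a string.
reply = 'Sure! Here is the data:\n{"name": "Ada", "years_experience": "7"}\nHope that helps.'
resume = parse_aligned(reply, Resume)
print(resume)
```

The sketch fails loudly on a missing field instead of guessing a value, which keeps the typed result trustworthy.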
What it does
- Define prompts as functions with typed parameters and return types, compiled into a generated baml_client
- Call the same BAML functions from Python, TypeScript, Ruby, Go, or any language via a REST API
- Schema-aligned parsing (SAP) for reliable structured output and tool-calling even with models that lack native tool-call APIs
- Type-safe streaming, including partial results during the stream and a final typed response
- Switch between many models by changing the client line, with retry policies, fallbacks, and round-robin model rotations
- IDE tooling for VS Code and JetBrains to visualize prompts, inspect the raw API request, and run tests in the playground
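The retry, fallback, and round-robin behavior listed above can be pictured with a small plain-Python sketch. BAML handles this for you, so this is not how you would write it in a BAML project; every function here (`call_with_fallback`, `always_fails`, `backup`, `model_a`, `model_b`) is a hypothetical stand-in used only to show the behavior.

```python
from itertools import cycle
from typing import Callable, Iterable


def call_with_fallback(
    clients: Iterable[Callable[[str], str]], prompt: str, max_retries: int = 2
) -> str:
    """Try each client in order, retrying each up to `max_retries` extra times."""
    last_error = None
    for client in clients:
        for _ in range(max_retries + 1):
            try:
                return client(prompt)
            except RuntimeError as err:
                last_error = err
    raise RuntimeError("all clients failed") from last_error


def always_fails(prompt: str) -> str:
    # Hypothetical stand-in for a provider that is currently down.
    raise RuntimeError("provider unavailable")


def backup(prompt: str) -> str:
    # Hypothetical stand-in for a healthy fallback model.
    return f"backup answered: {prompt}"


# Fallback: the first client exhausts its retries, then the second answers.
answer = call_with_fallback([always_fails, backup], "hello")


def model_a(prompt: str) -> str:
    return "a"


def model_b(prompt: str) -> str:
    return "b"


# Round-robin rotation: successive calls alternate between clients.
rotation = cycle([model_a, model_b])
picked = [next(rotation)("hi") for _ in range(4)]
print(answer, picked)
```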
Getting started
Install the BAML package for your language, initialize a project, write a function in a .baml file, then generate the client and call it from your code. The example below uses Python.
Install and initialize
Install the Python package and scaffold a BAML project. This creates a baml_src directory for your .baml files.
pip install baml-py
baml-cli init
Define a function in BAML
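In a .baml file under baml_src, declare a function with typed inputs and a typed return value, the model it uses, and its prompt. The types the example references (Message, StopTool, ReplyTool) are BAML classes you define in the same project; their definitions are not shown here.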
function ChatAgent(message: Message[], tone: "happy" | "sad") -> StopTool | ReplyTool {
  client "openai/gpt-4o-mini"
  prompt #"
    Be a {{ tone }} bot.
    {{ ctx.output_format }}
    {% for m in message %}
    {{ _.role(m.role) }}
    {{ m.content }}
    {% endfor %}
  "#
}
Generate the client
Run baml-cli generate to produce the baml_client code that lets you call your BAML functions. With the VS Code extension installed, generation also runs automatically on save.
baml-cli generate
Call it from your code
Import the generated client and call your function. The return value is typed, so you can branch on the result.
from baml_client import b
from baml_client.types import Message, StopTool

messages = [Message(role="assistant", content="How can I help?")]
tool = b.ChatAgent(messages, "happy")
if isinstance(tool, StopTool):
    print("Goodbye!")

Commands and code are distilled from the project's own documentation; always check the official repo for the latest.
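The Python example above handles only the StopTool branch. The sketch below shows one way to cover both members of the union return type. Because baml_client exists only after running baml-cli generate, it uses stand-in dataclasses with the same names; the `response` field on `ReplyTool` and the `handle` helper are assumptions made for this illustration, and the real fields depend on your own .baml definitions.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class StopTool:
    # Stand-in for the generated type; real fields depend on your .baml file.
    pass


@dataclass
class ReplyTool:
    # `response` is an assumed field name for this illustration.
    response: str


def handle(tool: Union[StopTool, ReplyTool]) -> str:
    """Return the text to show the user for either branch of the union."""
    if isinstance(tool, StopTool):
        return "Goodbye!"
    if isinstance(tool, ReplyTool):
        return tool.response
    # Fail loudly if a new union member is added but not handled here.
    raise TypeError(f"unexpected tool type: {type(tool).__name__}")


print(handle(StopTool()))
print(handle(ReplyTool(response="Happy to help!")))
```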
When to use it
- Extracting structured data, such as fields from a resume or document, into typed objects instead of parsing raw JSON yourself
- Building chat agents and multi-step workflows as while loops that call chained BAML functions with shared state
- Getting reliable tool-calling and structured output from models that do not support native tool-call APIs
- Adding retries, fallbacks, and model rotations across many providers without changing your application code
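The agent pattern in the list above, a while loop that calls a function repeatedly over shared state, can be sketched as follows. `fake_chat_agent` is a hypothetical stub standing in for a generated function such as `b.ChatAgent`, so the block runs without a model; the dataclasses, the `response` field, and the scripted inputs are likewise invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Message:
    role: str
    content: str


@dataclass
class ReplyTool:
    response: str  # assumed field name


@dataclass
class StopTool:
    pass


@dataclass
class AgentState:
    # Shared state carried across loop iterations.
    messages: List[Message] = field(default_factory=list)


def fake_chat_agent(messages: List[Message]) -> Union[StopTool, ReplyTool]:
    """Hypothetical stub standing in for a generated BAML function."""
    if messages and messages[-1].content.lower() == "bye":
        return StopTool()
    return ReplyTool(response=f"You said: {messages[-1].content}")


def run_agent(user_inputs: List[str]) -> AgentState:
    state = AgentState()
    pending = iter(user_inputs)
    while True:
        text = next(pending, None)
        if text is None:
            break  # no more user input
        state.messages.append(Message(role="user", content=text))
        tool = fake_chat_agent(state.messages)
        if isinstance(tool, StopTool):
            break  # the stubbed model chose to end the conversation
        state.messages.append(Message(role="assistant", content=tool.response))
    return state


final = run_agent(["hi", "bye", "ignored"])
print([m.content for m in final.messages])
```

Keeping the conversation in a single state object means each call sees the full history, and the loop ends as soon as the function returns the stop type.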
How BAML compares
The table below lists BAML alongside other open-source structured-output tools that AI/TLDR tracks, ranked by GitHub stars.
| Tool | Stars | What it does |
|---|---|---|
| Guidance | ★ 21.5k | A programming model that interleaves generation, prompting, and control logic to constrain output and enforce formats like JSON or regex patterns. |
| Outlines | ★ 14k | A library for structured generation that constrains an LLM's token output to match a JSON schema, regex, or grammar so the result is always valid. |
| Instructor | ★ 13.2k | A library that wraps an LLM client to return data validated against a schema, retrying automatically on invalid output, with SDKs in several languages. |
| BAML | ★ 8.4k | A prompting language that turns LLM prompts into typed functions with reliable structured output. |
| Marvin | ★ 6.2k | A Python toolkit from Prefect for turning LLM calls into typed functions that extract, classify, and cast text into structured Python objects. |
| LM Format Enforcer | ★ 2k | A library that enforces an output format such as JSON schema or regex by filtering the tokens an LLM is allowed to generate at each step. |
| XGrammar | ★ 1.8k | A fast, portable engine for grammar-constrained decoding that guarantees LLM output follows a given structure, used inside many inference servers. |