AI/TLDR

BAML

A prompting language that turns LLM prompts into typed functions with reliable structured output

Overview

BAML ("Basically a Made-up Language") is a small domain-specific language for writing LLM prompts. Instead of building free-form strings, you define each prompt as a function with typed inputs and a typed return value. Its Rust compiler then generates a client library (the baml_client) that you call from your own code, so model output comes back as structured, type-safe data rather than raw text you have to parse by hand.

It is aimed at developers building AI workflows and agents who want predictable, structured output across different models. You only write the prompts in BAML; the rest of your application stays in the language you already use. BAML provides quickstarts for Python, TypeScript, Ruby, and Go, plus a REST API path for other languages.

Within the structured-output and LLM-orchestration space, BAML focuses on schema engineering. It includes streaming, retries, fallbacks, and model rotations. Its schema-aligned parsing (SAP) approach parses loosely formatted model responses into the declared types, so structured output works even with models that do not support native tool-calling APIs.
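To make the idea of parsing a flexible response concrete, here is a toy sketch in Python. It is not BAML's SAP algorithm, only an illustration of the general approach under simple assumptions: the model wraps a JSON object in chatty prose and returns a number as a string. The Person type, PERSON_SCHEMA, and both helper functions are hypothetical names invented for this example.

```python
import json
from dataclasses import dataclass


@dataclass
class Person:
    # Hypothetical target type, used only for this illustration.
    name: str
    age: int


# Field name -> coercion function (hypothetical schema for this example).
PERSON_SCHEMA = {"name": str, "age": int}


def extract_json_object(text: str) -> dict:
    """Return the first decodable JSON object embedded in free-form text."""
    decoder = json.JSONDecoder()
    for start, char in enumerate(text):
        if char != "{":
            continue
        try:
            value, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            continue  # not valid JSON from this brace; keep scanning
        return value
    raise ValueError("no JSON object found in model output")


def parse_person(text: str) -> Person:
    """Align a loosely formatted reply with the Person schema."""
    raw = extract_json_object(text)
    # Coerce each declared field to its type and ignore unexpected keys.
    kwargs = {name: coerce(raw[name]) for name, coerce in PERSON_SCHEMA.items()}
    return Person(**kwargs)


reply = 'Sure! Here is the data:\n{"name": "Ada", "age": "36", "note": "n/a"}\nAnything else?'
person = parse_person(reply)
print(person)
```

The sketch tolerates surrounding prose, an extra key, and a stringly typed number. A production parser has to handle far more cases, which is the work BAML takes on for you.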

What it does

  • Define prompts as functions with typed parameters and return types, compiled into a generated baml_client
  • Call the same BAML functions from Python, TypeScript, Ruby, Go, or any language via a REST API
  • Schema-aligned parsing (SAP) for reliable structured output and tool-calling even with models that lack native tool-call APIs
  • Type-safe streaming, including partial results during the stream and a final typed response
  • Switch between many models by changing the client line, with retry policies, fallbacks, and round-robin model rotations
  • IDE tooling for VS Code and JetBrains to visualize prompts, inspect the raw API request, and run tests in the playground
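The streaming bullet above can be pictured with a small Python sketch. It does not use the generated baml_client; fake_stream is a hypothetical stand-in for a streaming model call, and Resume, PartialResume, and finalize are names invented here. The point is the shape of the data: partial snapshots may have missing fields, while the final value must be complete.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class Resume:
    # Final type: every field is required.
    name: str
    skills: list[str]


@dataclass
class PartialResume:
    # Partial type: fields may be missing while tokens are still arriving.
    name: Optional[str] = None
    skills: Optional[list[str]] = None


def fake_stream() -> Iterator[PartialResume]:
    """Hypothetical stand-in for a streaming LLM call; yields growing snapshots."""
    yield PartialResume()
    yield PartialResume(name="Ada")
    yield PartialResume(name="Ada", skills=["math"])
    yield PartialResume(name="Ada", skills=["math", "engines"])


def finalize(partial: PartialResume) -> Resume:
    """Promote the last partial to the final type, failing if a field is missing."""
    if partial.name is None or partial.skills is None:
        raise ValueError("stream ended before all fields were filled")
    return Resume(name=partial.name, skills=partial.skills)


last = PartialResume()
for last in fake_stream():
    print("partial:", last)  # each snapshot could be rendered in a UI as it grows

final = finalize(last)
print("final:", final)
```

Keeping the partial and final types separate lets a type checker flag code that reads a field before the stream has delivered it.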

Getting started

Install the BAML package for your language, initialize a project, write a function in a .baml file, then generate the client and call it from your code. The example below uses Python.

Install and initialize

Install the Python package and scaffold a BAML project. This creates a baml_src directory for your .baml files.

bash
pip install baml-py
baml-cli init

Define a function in BAML

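In your .baml file, declare a function with typed inputs and a typed return value, the model it uses, and its prompt. The Message, StopTool, and ReplyTool types referenced below are classes that must also be declared in your BAML source.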

baml
function ChatAgent(message: Message[], tone: "happy" | "sad") -> StopTool | ReplyTool {
    client "openai/gpt-4o-mini"

    prompt #"
        Be a {{ tone }} bot.

        {{ ctx.output_format }}

        {% for m in message %}
        {{ _.role(m.role) }}
        {{ m.content }}
        {% endfor %}
    "#
}

Generate the client

Run generate to produce the baml_client code that lets you call your BAML functions. With the VS Code extension installed, this also runs automatically on save.

bash
baml-cli generate

Call it from your code

Import the generated client and call your function. The return value is typed, so you can branch on the result.

python
from baml_client import b
from baml_client.types import Message, StopTool

messages = [Message(role="assistant", content="How can I help?")]
tool = b.ChatAgent(messages, "happy")
if isinstance(tool, StopTool):
    print("Goodbye!")
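The single call above can sit inside a conversation loop that runs until the model chooses to stop. The sketch below shows that loop in a form that runs without a generated client. The dataclasses stand in for what baml_client.types would provide, chat_agent is a hypothetical replacement for b.ChatAgent, and the response field on ReplyTool is an assumption about how that class is declared.

```python
from dataclasses import dataclass
from typing import Union


# Stand-ins for classes that a generated baml_client.types module would provide.
@dataclass
class Message:
    role: str
    content: str


@dataclass
class ReplyTool:
    response: str  # assumed field name


@dataclass
class StopTool:
    action: str = "stop"  # assumed field name


def chat_agent(messages: list[Message], tone: str) -> Union[StopTool, ReplyTool]:
    """Hypothetical stand-in for b.ChatAgent: stops when the user says bye."""
    latest = messages[-1].content
    if "bye" in latest.lower():
        return StopTool()
    return ReplyTool(response=f"({tone}) You said: {latest}")


def run_agent(user_turns: list[str]) -> list[Message]:
    """Drive the conversation until the agent returns StopTool or input runs out."""
    messages = [Message(role="assistant", content="How can I help?")]
    turns = iter(user_turns)
    while True:
        user_reply = next(turns, None)
        if user_reply is None:
            break  # no more scripted input
        messages.append(Message(role="user", content=user_reply))
        tool = chat_agent(messages, "happy")
        if isinstance(tool, StopTool):
            print("Goodbye!")
            break
        # Shared state: the reply is appended so the next call sees it.
        messages.append(Message(role="assistant", content=tool.response))
    return messages


history = run_agent(["Hello", "Bye now"])
```

In a real project the stand-ins would be replaced by imports from baml_client, and the scripted turns by input() or your application's message source.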

Commands and code are distilled from the project's own documentation; always check the official repo for the latest.

When to use it

  • Extracting structured data, such as fields from a resume or document, into typed objects instead of parsing raw JSON yourself
  • Building chat agents and multi-step workflows as while loops that call chained BAML functions with shared state
  • Getting reliable tool-calling and structured output from models that do not support native tool-call APIs
  • Adding retries, fallbacks, and model rotations across many providers without changing your application code
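The behavior behind retries, fallbacks, and rotations can be sketched in plain Python. BAML is described as handling this for you without application changes, so this is only a conceptual model of what those strategies do, not BAML's implementation or configuration syntax. All function names and the stand-in models are hypothetical.

```python
import itertools
from typing import Callable, Sequence

ModelCall = Callable[[str], str]


def with_retries(call: ModelCall, max_retries: int) -> ModelCall:
    """Retry a failing call up to max_retries extra times."""
    def wrapped(prompt: str) -> str:
        for attempt in range(max_retries + 1):
            try:
                return call(prompt)
            except RuntimeError:
                if attempt == max_retries:
                    raise  # retries exhausted; surface the last error
        raise AssertionError("unreachable")
    return wrapped


def fallback(calls: Sequence[ModelCall]) -> ModelCall:
    """Try each client in order and return the first success."""
    def wrapped(prompt: str) -> str:
        errors = []
        for call in calls:
            try:
                return call(prompt)
            except RuntimeError as err:
                errors.append(err)
        raise RuntimeError(f"all {len(calls)} clients failed: {errors}")
    return wrapped


def round_robin(calls: Sequence[ModelCall]) -> ModelCall:
    """Rotate through clients, one per request."""
    cycle = itertools.cycle(calls)
    def wrapped(prompt: str) -> str:
        return next(cycle)(prompt)
    return wrapped


# Stand-in models for the demonstration.
attempts = {"flaky": 0}

def flaky_model(prompt: str) -> str:
    """Fails on the first two attempts, then succeeds."""
    attempts["flaky"] += 1
    if attempts["flaky"] <= 2:
        raise RuntimeError("rate limited")
    return f"flaky: {prompt}"

def broken_model(prompt: str) -> str:
    raise RuntimeError("unavailable")

def model_a(prompt: str) -> str:
    return f"a: {prompt}"

def model_b(prompt: str) -> str:
    return f"b: {prompt}"


retried = with_retries(flaky_model, max_retries=2)("hi")
fell_back = fallback([broken_model, model_a])("hi")
rotate = round_robin([model_a, model_b])
rotated = [rotate("hi") for _ in range(3)]
print(retried, fell_back, rotated)
```

Because each strategy wraps a callable and returns a callable, they compose: a fallback list can contain clients that already have retries applied.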

How BAML compares

BAML alongside the other open-source structured-output tools that AI/TLDR tracks, ranked by GitHub stars.

Tool | Stars | What it does
Guidance | ★ 21.5k | A programming model that interleaves generation, prompting, and control logic to constrain output and enforce formats like JSON or regex patterns.
Outlines | ★ 14k | A library for structured generation that constrains an LLM's token output to match a JSON schema, regex, or grammar so the result is always valid.
Instructor | ★ 13.2k | A library that wraps an LLM client to return data validated against a schema, retrying automatically on invalid output, with SDKs in several languages.
BAML | ★ 8.4k | A prompting language that turns LLM prompts into typed functions with reliable structured output.
Marvin | ★ 6.2k | A Python toolkit from Prefect for turning LLM calls into typed functions that extract, classify, and cast text into structured Python objects.
LM Format Enforcer | ★ 2k | A library that enforces an output format such as JSON schema or regex by filtering the tokens an LLM is allowed to generate at each step.
XGrammar | ★ 1.8k | A fast, portable engine for grammar-constrained decoding that guarantees LLM output follows a given structure, used inside many inference servers.