BAML

A prompting language that turns LLM prompts into typed functions with reliable structured output

github.com/BoundaryML/baml · ★ 8.4k · boundaryml.com

Overview

BAML ("Basically a Made-up Language") is a small domain-specific language for writing LLM prompts. Instead of building free-form strings, you define each prompt as a function with typed inputs and a typed return value. Its Rust compiler then generates a client library (the baml_client) that you call from your own code, so model output comes back as structured, type-safe data rather than raw text you have to parse by hand.

It is aimed at developers building AI workflows and agents who want predictable, structured output across different models. You only write the prompts in BAML; the rest of your application stays in the language you already use. BAML provides quickstarts for Python, TypeScript, Ruby, and Go, plus a REST API path for other languages.

Among structured-output and LLM-orchestration tools, BAML focuses on schema engineering: you shape results mainly by defining the output types rather than by hand-tuning prompt text. It supports streaming, retries, fallbacks, and model rotation. Its schema-aligned parsing (SAP) recovers structured data from loosely formatted model responses, so it also works with models that lack native tool-calling APIs.
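
To make the idea of lenient, schema-guided parsing concrete, here is a minimal Python sketch. It is not BAML's SAP algorithm, and the `Resume` dataclass and `parse_lenient` helper are hypothetical names invented for this illustration. It only shows the general shape of the technique: locate the JSON-like object inside a chatty response, tolerate a trailing comma, and coerce each value to the type the schema declares.

```python
import json
import re
from dataclasses import dataclass, fields


@dataclass
class Resume:  # hypothetical schema, not taken from the BAML docs
    name: str
    years_experience: int


def parse_lenient(text: str, schema):
    """Pull the first {...} object out of free-form text and coerce it to `schema`."""
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    # Naively drop a trailing comma before a closing brace or bracket.
    cleaned = re.sub(r",\s*([}\]])", r"\1", match.group(0))
    raw = json.loads(cleaned)
    # Coerce each declared field to its annotated type (e.g. "7" -> 7).
    return schema(**{f.name: f.type(raw[f.name]) for f in fields(schema)})


# A reply with surrounding prose, a quoted number, and a trailing comma.
reply = 'Sure! Here is the data: {"name": "Ada", "years_experience": "7",} Anything else?'
resume = parse_lenient(reply, Resume)
print(resume)
```

The regular expressions here are deliberately simple and would mangle some valid inputs (for example, a string value containing `,}`). A production parser needs far more care, which is the work a dedicated parsing layer takes off your hands.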

What it does

Define prompts as functions with typed parameters and return types, compiled into a generated baml_client
Call the same BAML functions from Python, TypeScript, Ruby, Go, or any language via a REST API
Schema-aligned parsing (SAP) for reliable structured output and tool-calling even with models that lack native tool-call APIs
Type-safe streaming, including partial results during the stream and a final typed response
Switch between many models by changing the client line, with retry policies, fallbacks, and round-robin model rotations
IDE tooling for VS Code and JetBrains to visualize prompts, inspect the raw API request, and run tests in the playground
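
The streaming item is easier to picture with a small example. The Python sketch below simulates the pattern with stand-ins and does not use BAML's generated streaming API. `PartialResume`, `Resume`, `stream_resume`, and `finalize` are all hypothetical names. It assumes partial results are modeled as a type whose fields may still be missing, while the final response uses a type whose fields are all required.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class PartialResume:  # hypothetical partial type: any field may still be missing
    name: Optional[str] = None
    years_experience: Optional[int] = None


@dataclass
class Resume:  # hypothetical final type: every field is required
    name: str
    years_experience: int


def stream_resume() -> Iterator[PartialResume]:
    """Stand-in for a streaming model call: yields progressively fuller snapshots."""
    yield PartialResume()
    yield PartialResume(name="Ada")
    yield PartialResume(name="Ada", years_experience=7)


def finalize(partial: PartialResume) -> Resume:
    """Promote the last snapshot to the final type, failing if a field is missing."""
    if partial.name is None or partial.years_experience is None:
        raise ValueError("stream ended before all fields arrived")
    return Resume(partial.name, partial.years_experience)


snapshots = list(stream_resume())
for snap in snapshots:
    print("partial:", snap)  # a UI could render each snapshot as it arrives
final = finalize(snapshots[-1])
print("final:", final)
```

Keeping the partial and final types separate means code that renders in-progress output must handle missing fields, while code that consumes the final result never has to.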

Getting started

Install the BAML package for your language, initialize a project, write a function in a .baml file, then generate the client and call it from your code. The example below uses Python.

Install and initialize

Install the Python package and scaffold a BAML project. This creates a baml_src directory for your .baml files.

bash

pip install baml-py
baml-cli init

Define a function in BAML

In your .baml file, declare a function with typed inputs and a typed return value, the model it uses, and its prompt.

baml

function ChatAgent(message: Message[], tone: "happy" | "sad") -> StopTool | ReplyTool {
    client "openai/gpt-4o-mini"

    prompt #"
        Be a {{ tone }} bot.

        {{ ctx.output_format }}

        {% for m in message %}
        {{ _.role(m.role) }}
        {{ m.content }}
        {% endfor %}
    "#
}

Generate the client

Run generate to produce the baml_client code that lets you call your BAML functions. With the VS Code extension installed, this also runs automatically on save.

bash

baml-cli generate

Call it from your code

Import the generated client and call your function. The return value is typed, so you can branch on the result.

python

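from baml_client import b
from baml_client.types import Message, StopTool

# Start the conversation with a single assistant message.
messages = [Message(role="assistant", content="How can I help?")]
# ChatAgent returns a StopTool or a ReplyTool, as declared in the .baml file.
tool = b.ChatAgent(messages, "happy")
if isinstance(tool, StopTool):
    print("Goodbye!")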

Commands and code are distilled from the project's own documentation; always check the official repo for the latest.
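
The single call above extends naturally into a conversation loop that branches on both return types. The sketch below runs without a generated client, so everything in it is a stand-in. The dataclasses mimic `Message`, `ReplyTool`, and `StopTool` with assumed field names (`response` and `action` do not appear in the snippet above), and `chat_agent` is a hypothetical stub playing the role of `b.ChatAgent`.

```python
from dataclasses import dataclass
from typing import List, Union


# Stand-ins for the generated types; the field names are assumptions.
@dataclass
class Message:
    role: str
    content: str


@dataclass
class ReplyTool:
    response: str


@dataclass
class StopTool:
    action: str = "stop"


def chat_agent(messages: List[Message], tone: str) -> Union[StopTool, ReplyTool]:
    """Stub standing in for b.ChatAgent: stops when the user says 'bye'."""
    last = messages[-1].content
    if last.strip().lower() == "bye":
        return StopTool()
    return ReplyTool(response=f"({tone}) You said: {last}")


def run(user_inputs: List[str]) -> List[Message]:
    """Drive the agent over scripted inputs, keeping the history as shared state."""
    messages = [Message(role="assistant", content="How can I help?")]
    for text in user_inputs:  # an interactive agent would read input() in a while loop
        messages.append(Message(role="user", content=text))
        tool = chat_agent(messages, "happy")
        if isinstance(tool, StopTool):
            break
        messages.append(Message(role="assistant", content=tool.response))
    return messages


history = run(["hello", "bye", "ignored"])
print([m.role for m in history])
```

Because the return type is a union, the `isinstance` check is all the routing the loop needs: a `StopTool` ends the conversation, and anything else is a reply to append to the history.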

When to use it

Extracting structured data, such as fields from a resume or document, into typed objects instead of parsing raw JSON yourself
Building chat agents and multi-step workflows as while loops that call chained BAML functions with shared state
Getting reliable tool-calling and structured output from models that do not support native tool-call APIs
Adding retries, fallbacks, and model rotations across many providers without changing your application code
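
In BAML, retries, fallbacks, and rotations are configured in the BAML files rather than in application code. To show what those behaviors amount to, here is a plain-Python sketch. It does not reflect BAML's configuration syntax or internals; `with_retries`, `with_fallback`, `round_robin`, and the fake clients are hypothetical names made up for the example.

```python
import itertools
from typing import Callable, List

ModelCall = Callable[[str], str]


def with_retries(call: ModelCall, attempts: int) -> ModelCall:
    """Retry the same client up to `attempts` times before giving up."""
    def wrapped(prompt: str) -> str:
        last_error = None
        for _ in range(attempts):
            try:
                return call(prompt)
            except RuntimeError as err:
                last_error = err
        raise RuntimeError(f"all {attempts} attempts failed") from last_error
    return wrapped


def with_fallback(clients: List[ModelCall]) -> ModelCall:
    """Try each client in order and return the first successful answer."""
    def wrapped(prompt: str) -> str:
        for call in clients:
            try:
                return call(prompt)
            except RuntimeError:
                continue
        raise RuntimeError("every client failed")
    return wrapped


def round_robin(clients: List[ModelCall]) -> ModelCall:
    """Rotate through the clients, using the next one for each request."""
    cycle = itertools.cycle(clients)
    return lambda prompt: next(cycle)(prompt)


# Fake clients standing in for real providers.
def flaky(prompt: str) -> str:
    raise RuntimeError("provider unavailable")


def stable_a(prompt: str) -> str:
    return f"A:{prompt}"


def stable_b(prompt: str) -> str:
    return f"B:{prompt}"


resilient = with_fallback([with_retries(flaky, attempts=2), stable_a])
rotating = round_robin([stable_a, stable_b])
print(resilient("hi"), [rotating("x") for _ in range(3)])
```

Each wrapper has the same signature as the client it wraps, so the strategies compose freely and the calling code never needs to know which one is in effect.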

How BAML compares

BAML alongside other open-source structured output tools AI/TLDR tracks, ranked by GitHub stars.

Tool	Stars	What it does
Guidance	★ 21.5k	A programming model that interleaves generation, prompting, and control logic to constrain output and enforce formats like JSON or regex patterns.
Outlines	★ 14k	A library for structured generation that constrains an LLM's token output to match a JSON schema, regex, or grammar so the result is always valid.
Instructor	★ 13.2k	A library that wraps an LLM client to return data validated against a schema, retrying automatically on invalid output, with SDKs in several languages.
BAML	★ 8.4k	A prompting language that turns LLM prompts into typed functions with reliable structured output.
Marvin	★ 6.2k	A Python toolkit from Prefect for turning LLM calls into typed functions that extract, classify, and cast text into structured Python objects.
LM Format Enforcer	★ 2k	A library that enforces an output format such as JSON schema or regex by filtering the tokens an LLM is allowed to generate at each step.
XGrammar	★ 1.8k	A fast, portable engine for grammar-constrained decoding that guarantees LLM output follows a given structure, used inside many inference servers.
