LMQL

A query language that blends Python control flow with LLM prompts and output constraints

Overview

LMQL is a programming language for large language models, built as a superset of Python. Instead of stitching prompts together with string templates, you write code that reads like normal Python: top-level strings are sent to the model as prompt text, and template variables such as [GREETINGS] are filled in by the LLM during execution.

It is aimed at developers who need more than a single prompt: people building multi-step generation, structured output, or agent-style logic where the program decides what to ask the model next. It sits in the prompt-programming corner of LLM orchestration, alongside templating and chaining tools, but pushes LLM calls down to the language level.

A key idea is the where keyword, which attaches constraints to generated text: stopping phrases, length limits, or data types. LMQL also offers several decoding strategies (argmax, sample, beam search, best_k) and works with OpenAI, Azure OpenAI, and Hugging Face Transformers models.
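As a rough picture of how two of these decoding strategies differ, the sketch below contrasts argmax and sample on a single next-token choice. It is plain Python, not LMQL code: the token scores and the helper names (softmax, decode_argmax, decode_sample) are invented for illustration, and beam search and best_k are left out.

```python
import math
import random

# Invented next-token scores, for illustration only.
SCORES = {"Hello": 2.0, "Hi": 1.5, "Hey": 0.5}


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores.values())
    exps = {tok: math.exp(s - peak) for tok, s in scores.items()}
    total = sum(exps.values())
    return {tok: value / total for tok, value in exps.items()}


def decode_argmax(scores):
    """Always pick the single highest-scoring token."""
    return max(scores, key=scores.get)


def decode_sample(scores, rng):
    """Draw one token at random, weighted by its probability."""
    probs = softmax(scores)
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]


rng = random.Random(0)  # seeded so repeated runs draw the same samples
greedy = decode_argmax(SCORES)
samples = [decode_sample(SCORES, rng) for _ in range(5)]
print(greedy, samples)
```

Argmax is deterministic and returns the same token every time, while sampling can return any token with nonzero probability, which is why sampled runs vary unless the random source is seeded.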

What it does

Python-based syntax: queries are written in a superset of Python, so classes, variable captures, and control flow all work natively
Output constraints via the where keyword: length limits, stopping phrases, character-level rules, and data types, enforced through logit masking
Multiple decoding algorithms including argmax, sample, beam search, and best_k
Sync and async APIs that can run many queries in parallel with cross-query batching
Multi-model support for OpenAI API, Azure OpenAI, and Hugging Face Transformers
Integrations with LangChain and LlamaIndex, plus a browser-based Playground IDE and a VS Code extension
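The logit masking named in the list above can be pictured with a toy example. This is a sketch of the general idea, not LMQL's implementation: the vocabulary, the scores, and the helper names (mask_logits, argmax_token) are made up. Tokens that would violate a constraint get a score of negative infinity before the next token is chosen, so the decoder can never pick them.

```python
import math

# Toy vocabulary and raw next-token scores (logits); values are invented.
VOCAB = ["Hello", "\n", "42", "world", "."]
LOGITS = [1.2, 3.5, 0.7, 2.1, 0.3]


def mask_logits(logits, vocab, allowed):
    """Return a copy of logits with disallowed tokens set to -inf."""
    return [
        score if allowed(token) else -math.inf
        for score, token in zip(logits, vocab)
    ]


def argmax_token(logits, vocab):
    """Pick the highest-scoring token (greedy / argmax decoding)."""
    best_index = max(range(len(logits)), key=lambda i: logits[i])
    return vocab[best_index]


# Constraint 1: the output must not contain a newline.
no_newline = mask_logits(LOGITS, VOCAB, lambda tok: "\n" not in tok)

# Constraint 2: the output must consist of digits only.
digits_only = mask_logits(LOGITS, VOCAB, lambda tok: tok.isdigit())

unconstrained_choice = argmax_token(LOGITS, VOCAB)
no_newline_choice = argmax_token(no_newline, VOCAB)
digits_choice = argmax_token(digits_only, VOCAB)
print(repr(unconstrained_choice), repr(no_newline_choice), repr(digits_choice))
```

Without a mask the newline token wins because it has the highest score. With the newline masked, the next-best token is chosen instead, and with the digits-only mask only "42" remains eligible.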

Getting started

LMQL needs Python 3.10. Install it with pip, then launch the Playground IDE or write a query in Python.

Install LMQL

Install the latest release with pip. Python 3.10 must be available.

pip install lmql

Add local GPU support (optional)

To run models on a local GPU, install in an environment with a GPU-enabled PyTorch >= 1.11 and use the hf extra.

pip install "lmql[hf]"

Launch the Playground IDE

After installing, start the interactive Playground to write and run queries in the browser. This requires Node.js to be installed.

lmql playground

Write a query

An LMQL program reads like Python; top-level strings are sent to the model and bracketed variables are completed by it. The where clause constrains the output.

"Greet LMQL:[GREETINGS]\n" where stops_at(GREETINGS, ".") and not "\n" in GREETINGS

"To summarize:[SUMMARY]"

Commands and code are distilled from the project's own documentation; always check the official repo for the latest version.
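To make the execution model of the query above more concrete, here is a plain-Python imitation of it. It does not use or reproduce LMQL: fake_model, CANNED, apply_stops_at, and run_template are hypothetical names, the completions are canned, and the stopping phrase is applied by trimming the text afterwards, whereas a real constrained decoder would enforce it during generation.

```python
import re

# Canned completions standing in for a real model; purely illustrative.
CANNED = {
    "GREETINGS": " Hello there. How are you?\nBye",
    "SUMMARY": " a short greeting",
}


def fake_model(prompt, variable):
    """Hypothetical stand-in for an LLM call: returns a fixed continuation."""
    return CANNED[variable]


def apply_stops_at(text, phrase):
    """Cut the text right after the first occurrence of phrase, if present."""
    index = text.find(phrase)
    return text if index == -1 else text[: index + len(phrase)]


def run_template(prompt_so_far, template, stop=None):
    """Append template to the prompt, filling each [VAR] from the model."""
    values = {}
    position = 0
    for match in re.finditer(r"\[([A-Z_]+)\]", template):
        # Literal text before the placeholder becomes part of the prompt.
        prompt_so_far += template[position:match.start()]
        completion = fake_model(prompt_so_far, match.group(1))
        if stop is not None:
            completion = apply_stops_at(completion, stop)
        values[match.group(1)] = completion
        prompt_so_far += completion
        position = match.end()
    prompt_so_far += template[position:]
    return prompt_so_far, values


prompt, first = run_template("", "Greet LMQL:[GREETINGS]\n", stop=".")
prompt, second = run_template(prompt, "To summarize:[SUMMARY]")
print(prompt)
```

Each statement extends the same running prompt, so the second variable is generated with the first greeting already in context. The captured values stay available as ordinary Python data, which is what lets program logic branch on earlier model output.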

When to use it

Scripting multi-step generation where program logic decides the next prompt based on earlier model output
Producing structured or schema-safe output (for example JSON) by constraining the model with the where clause
Running hundreds of queries in parallel with the async API and cross-query batching
Prototyping prompting strategies interactively in the Playground IDE before wiring them into a LangChain or LlamaIndex stack
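The parallel-query use case in the list above follows a common asyncio pattern, sketched here with the standard library only. The coroutine fake_query is a hypothetical stand-in for an async model query, the limit of 8 concurrent calls is an arbitrary choice, and cross-query batching is not modeled.

```python
import asyncio


async def fake_query(topic):
    """Hypothetical stand-in for an async model query."""
    await asyncio.sleep(0.01)  # simulate waiting on the model
    return f"summary of {topic}"


async def run_all(topics, limit=8):
    """Run one query per topic, at most `limit` at a time, keeping input order."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(topic):
        # The semaphore caps how many queries are in flight at once.
        async with semaphore:
            return await fake_query(topic)

    return await asyncio.gather(*(bounded(topic) for topic in topics))


topics = [f"doc-{i}" for i in range(100)]
results = asyncio.run(run_all(topics))
print(len(results), results[0])
```

asyncio.gather returns results in the order the coroutines were passed in, so each result lines up with its input topic even though the queries finish at different times.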

How LMQL compares

LMQL alongside other open-source prompt programming tools AI/TLDR tracks, ranked by GitHub stars.

Tool	Stars	What it does
DSPy	★ 35.8k	A Stanford framework for programming language models with composable modules and automatic prompt optimization instead of hand-written prompts.
ell	★ 5.9k	A Python library that treats prompts as versioned functions, with tooling to track, visualize, and iterate on them as code.
GEPA	★ 5.5k	A reflective, evolutionary optimizer that improves prompts and other text components of a system using language-model feedback.
LMQL	★ 4.2k	A query language that blends Python control flow with LLM prompts and output constraints
AdalFlow	★ 4.2k	A PyTorch-like library for building and auto-optimizing LLM pipelines, tuning prompts across the components of a task.
TextGrad	★ 3.6k	A library that optimizes prompts and other text variables using textual gradients, applying a backpropagation-like loop driven by LLM feedback.
Mirascope	★ 1.5k	A lightweight Python toolkit for writing LLM calls as typed functions with prompt templates, chaining, and a single interface across providers.
