Overview
Guardrails AI is a Python framework for building reliable LLM applications. It runs Input and Output Guards around your model calls to detect, measure, and reduce specific risks, and it can also force model output into a defined structure.
It is aimed at developers who already call an LLM in Python and want a checkpoint before a response reaches a user. You pull pre-built validators from the Guardrails Hub, combine them into a Guard, and let that Guard validate text or generate structured data.
As a guardrail framework, it sits between your application code and the model. Validators cover things like regex matching, competitor mentions, and toxic language, and each Guard decides what to do when a check fails.
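The idea of a checkpoint on either side of a model call can be sketched with the standard library alone. This is not Guardrails AI's implementation or API: `guarded_call`, `fake_model`, `GuardError`, and both check functions are hypothetical names invented for illustration.

```python
import re
from typing import Callable, List, Optional

# A check takes text and returns an error message, or None if the text passes.
Check = Callable[[str], Optional[str]]


class GuardError(Exception):
    """Raised when an input or output check fails (hypothetical name)."""


def no_email_address(text: str) -> Optional[str]:
    # Input check: reject prompts containing something shaped like an email.
    if re.search(r"[\w.+-]+@[\w-]+\.[\w.]+", text):
        return "prompt contains an email address"
    return None


def max_length(limit: int) -> Check:
    # Output check factory: reject responses longer than `limit` characters.
    def check(text: str) -> Optional[str]:
        if len(text) > limit:
            return f"response exceeds {limit} characters"
        return None
    return check


def run_checks(stage: str, text: str, checks: List[Check]) -> None:
    # Stop at the first failing check and report which stage it belonged to.
    for check in checks:
        error = check(text)
        if error is not None:
            raise GuardError(f"{stage} guard failed: {error}")


def guarded_call(model: Callable[[str], str], prompt: str,
                 input_checks: List[Check], output_checks: List[Check]) -> str:
    run_checks("input", prompt, input_checks)      # before the model sees it
    response = model(prompt)
    run_checks("output", response, output_checks)  # before the user sees it
    return response


def fake_model(prompt: str) -> str:
    # Stand-in for a real LLM call.
    return f"Echo: {prompt}"
```

Calling `guarded_call(fake_model, "hello", [no_email_address], [max_length(50)])` returns the model's response unchanged, while a prompt containing an email address raises `GuardError` before the model is ever called.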
What it does
- Pull pre-built validators (regex match, competitor check, toxic language, and more) from the Guardrails Hub
- Combine multiple validators into a single Input or Output Guard
- Configure failure handling per validator via OnFailAction (for example, raise an exception)
- Generate structured data from LLMs using a Pydantic BaseModel with Guard.for_pydantic
- Work with proprietary and open-source LLMs, using function calling or prompt-based schema injection
- Run Guardrails as a standalone Flask service with a REST API via guardrails start
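To make per-validator failure handling concrete, here is a minimal stdlib-only sketch in which each rule carries its own failure policy. `FailPolicy`, `Rule`, and `validate` are hypothetical names; they mirror the idea behind OnFailAction in spirit only and are not the library's types.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List


class FailPolicy(Enum):
    # Hypothetical stand-in for a per-validator failure setting.
    EXCEPTION = "exception"  # stop immediately
    NOOP = "noop"            # record the failure and keep going


@dataclass
class Rule:
    name: str
    passes: Callable[[str], bool]
    on_fail: FailPolicy


def validate(text: str, rules: List[Rule]) -> List[str]:
    """Run every rule in order and return the names of rules that failed softly."""
    soft_failures = []
    for rule in rules:
        if rule.passes(text):
            continue
        if rule.on_fail is FailPolicy.EXCEPTION:
            raise ValueError(f"validation failed: {rule.name}")
        soft_failures.append(rule.name)
    return soft_failures


# Two rules combined into one list, each with a different policy.
rules = [
    Rule("no_competitor", lambda t: "acme" not in t.lower(), FailPolicy.EXCEPTION),
    Rule("short_enough", lambda t: len(t) <= 40, FailPolicy.NOOP),
]
```

With these rules, text that is merely too long comes back with `["short_enough"]` in the result, whereas text mentioning the made-up competitor "Acme" raises `ValueError`.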
Getting started
Install the package, configure the Hub CLI, then install a validator and build a Guard.
Install and configure
Install the package from PyPI and run the Hub configuration command.
pip install guardrails-ai
guardrails configure
Install a validator from the Hub
Pull a validator, such as regex match, from the Guardrails Hub.
guardrails hub install hub://guardrails/regex_match
Create a Guard and validate
Build a Guard from the installed validator and run validation. Passing text returns normally; failing text raises an exception.
from guardrails import Guard, OnFailAction
from guardrails.hub import RegexMatch
# Raw string so the regex backslashes are not treated as string escapes.
guard = Guard().use(
    RegexMatch, regex=r"\(?\d{3}\)?-? *\d{3}-? *-?\d{4}", on_fail=OnFailAction.EXCEPTION
)
guard.validate("123-456-7890")  # Guardrail passes
try:
    guard.validate("1234-789-0000")  # Guardrail fails
except Exception as e:
    print(e)
Commands and code are distilled from the project's own documentation; always check the official repo for the latest.
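Why the second string fails is easy to check with Python's `re` module on its own. The sketch below assumes the validator requires the whole string to match the pattern, which is consistent with the failure shown above but is not stated in this document; the helper names are hypothetical.

```python
import re

# Same pattern as in the Guard example, written as a raw string.
PHONE = re.compile(r"\(?\d{3}\)?-? *\d{3}-? *-?\d{4}")


def whole_string_matches(text: str) -> bool:
    # True only if the pattern covers the entire string.
    return PHONE.fullmatch(text) is not None


def contains_match(text: str) -> bool:
    # True if the pattern matches anywhere inside the string.
    return PHONE.search(text) is not None


print(whole_string_matches("123-456-7890"))   # first group has three digits
print(whole_string_matches("1234-789-0000"))  # first group has four digits
print(contains_match("1234-789-0000"))        # substring "234-789-0000" fits
```

The distinction matters: under a substring search, "1234-789-0000" would pass because "234-789-0000" fits the pattern, so a whole-string match is what makes the format check strict.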
When to use it
- Block or flag toxic language in model responses before they reach end users
- Enforce that an LLM does not mention named competitors in generated text
- Validate that model output matches an expected format, such as a phone number pattern
- Generate structured JSON data from an LLM that conforms to a Pydantic schema
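For the structured-data case, the kind of check involved can be illustrated without any third-party package. This is a rough stdlib sketch, not how Guardrails or Pydantic validate output; the `Pet` schema and `parse_pet` helper are hypothetical.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class Pet:
    # Hypothetical schema used only for illustration; both fields are strings.
    pet_type: str
    name: str


def parse_pet(raw: str) -> Pet:
    """Parse model output and confirm it has exactly the expected string fields."""
    data = json.loads(raw)  # malformed JSON raises json.JSONDecodeError (a ValueError)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    names = [f.name for f in fields(Pet)]
    missing = [n for n in names if n not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    extra = sorted(set(data) - set(names))
    if extra:
        raise ValueError(f"unexpected fields: {extra}")
    for n in names:
        if not isinstance(data[n], str):
            raise ValueError(f"field {n!r} must be a string")
    return Pet(**data)
```

A schema library does this far more thoroughly (nested models, type coercion, descriptive errors), which is the reason to hand the job to a Pydantic model rather than hand-rolled checks like these.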
How Guardrails AI compares
Guardrails AI alongside the other open-source guardrails and security tools that AI/TLDR tracks, ranked by GitHub stars.
| Tool | Stars | What it does |
|---|---|---|
| Microsoft Presidio | ★ 9.3k | A framework for detecting, redacting, masking, and anonymizing personal data (PII) in text, images, and structured data using NER models, regex, and rule-based recognizers. |
| Guardrails AI | ★ 7k | Validate LLM inputs and outputs with composable guards in Python |
| NeMo Guardrails | ★ 6.5k | NVIDIA's toolkit for adding programmable rails to LLM chat apps, using the Colang language to control dialog flow and block jailbreaks, prompt injection, and off-topic answers. |
| GLiNER | ★ 3.3k | A small zero-shot named-entity recognition model that can extract arbitrary entity types from text and is widely used as a PII detection backend, including inside Presidio. |
| LLM Guard | ★ 3.1k | A security toolkit from Protect AI with 35+ input and output scanners that sanitize prompts and responses for prompt injection, toxicity, PII leakage, and harmful content. |
| Rebuff | ★ 1.5k | A prompt injection detector that combines heuristics, an LLM-based classifier, a vector store of past attacks, and canary tokens to catch attempts to subvert an LLM application. |
| Detoxify | ★ 1.3k | Pretrained transformer models from Unitary that score text for toxicity, insults, threats, and hate speech, often used to moderate LLM inputs and outputs. |
| Vigil | ★ 482 | A Python library and REST API that scans LLM prompts and responses with YARA rules, transformer classifiers, and vector similarity to flag prompt injections and jailbreaks. |