Guardrails AI

Validate LLM inputs and outputs with composable guards in Python

github.com/guardrails-ai/guardrails · ★ 7k · guardrailsai.com

Overview

Guardrails AI is a Python framework for building reliable LLM applications. It runs Input and Output Guards around your model calls to detect, measure, and reduce specific risks, and it can also force model output into a defined structure.

It is aimed at developers who already call an LLM in Python and want a checkpoint before a response reaches a user. You pull pre-built validators from the Guardrails Hub, combine them into a Guard, and let that Guard validate text or generate structured data.

As a guardrail framework, it sits between your application code and the model. Validators cover things like regex matching, competitor mentions, and toxic language, and each Guard decides what to do when a check fails.
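The flow described above can be pictured in plain Python. This is a toy sketch, not Guardrails' implementation: `ToyGuard`, `ToyValidator`, and `ToyOnFail` are hypothetical names invented for this example, and the competitor list is made up. It only illustrates how a guard might chain checks and apply a separate failure policy to each one.

```python
import re
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List


class ToyOnFail(Enum):
    """Hypothetical failure policies, loosely modelled on the idea of OnFailAction."""
    EXCEPTION = "exception"  # stop immediately and raise
    NOOP = "noop"            # record the failure and keep going


@dataclass
class ToyValidator:
    name: str
    check: Callable[[str], bool]  # returns True when the text is acceptable
    on_fail: ToyOnFail = ToyOnFail.EXCEPTION


@dataclass
class ToyGuard:
    validators: List[ToyValidator] = field(default_factory=list)

    def use(self, validator: ToyValidator) -> "ToyGuard":
        # Return self so that validators can be chained.
        self.validators.append(validator)
        return self

    def validate(self, text: str) -> List[str]:
        """Run every validator in order; return the names of the soft failures."""
        failed: List[str] = []
        for v in self.validators:
            if v.check(text):
                continue
            if v.on_fail is ToyOnFail.EXCEPTION:
                raise ValueError(f"validation failed: {v.name}")
            failed.append(v.name)
        return failed


COMPETITORS = ("acme", "globex")  # made-up names for the example

guard = (
    ToyGuard()
    .use(ToyValidator(
        "no_competitors",
        lambda t: not any(c in t.lower() for c in COMPETITORS),
        on_fail=ToyOnFail.EXCEPTION,
    ))
    .use(ToyValidator(
        "ends_with_period",
        lambda t: re.search(r"\.\s*$", t) is not None,
        on_fail=ToyOnFail.NOOP,
    ))
)

print(guard.validate("Our product ships on Friday."))  # → []
print(guard.validate("Our product ships on Friday"))   # → ['ends_with_period']
```

A competitor mention raises `ValueError` in this sketch, while a missing period is only reported, which mirrors the idea that each check carries its own failure policy.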

What it does

Pull pre-built validators (regex match, competitor check, toxic language, and more) from the Guardrails Hub
Combine multiple validators into a single Input or Output Guard
Configure failure handling per validator via OnFailAction (for example, raise an exception)
Generate structured data from LLMs using a Pydantic BaseModel with Guard.for_pydantic
Work with proprietary and open-source LLMs, using function calling or prompt-based schema injection
Run Guardrails as a standalone Flask service with a REST API via guardrails start
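To make prompt-based schema injection concrete, here is a sketch that uses only the standard library. It does not call Guardrails or Pydantic: `Pet`, `schema_prompt`, and `parse_pet` are hypothetical names, and the model reply is a hard-coded stand-in. With Guardrails, the schema would instead be a Pydantic BaseModel passed to Guard.for_pydantic.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class Pet:  # hypothetical schema for the example
    pet_type: str
    name: str


def schema_prompt(task: str) -> str:
    """Append a description of the expected JSON fields to the task."""
    spec = {f.name: f.type.__name__ for f in fields(Pet)}
    return f"{task}\nReply with only a JSON object of this shape: {json.dumps(spec)}"


def parse_pet(raw: str) -> Pet:
    """Parse a model reply and reject anything that does not fit the schema."""
    data = json.loads(raw)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    expected = {f.name: f.type for f in fields(Pet)}
    if set(data) != set(expected):
        raise ValueError(f"expected keys {sorted(expected)}, got {sorted(data)}")
    for key, typ in expected.items():
        if not isinstance(data[key], typ):
            raise ValueError(f"{key} should be {typ.__name__}")
    return Pet(**data)


reply = '{"pet_type": "dog", "name": "Biscuit"}'  # stand-in for a model reply
print(schema_prompt("Suggest a pet and a name for it."))
print(parse_pet(reply))
```

The point of the design is that the same schema drives both the instruction sent to the model and the check applied to its reply, so the two cannot drift apart.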

Getting started

Install the package, configure the Hub CLI, then install a validator and build a Guard.

Install and configure

Install the package from PyPI and run the Hub configuration command.

pip install guardrails-ai
guardrails configure

Install a validator from the Hub

Pull a validator, such as regex match, from the Guardrails Hub.

guardrails hub install hub://guardrails/regex_match

Create a Guard and validate

Build a Guard from the installed validator and run validation. Passing text returns normally; failing text raises an exception.

from guardrails import Guard, OnFailAction
from guardrails.hub import RegexMatch

# Raise an exception whenever the text does not match the phone-number pattern.
guard = Guard().use(
    RegexMatch,
    regex=r"\(?\d{3}\)?-? *\d{3}-? *-?\d{4}",  # raw string keeps the backslashes literal
    on_fail=OnFailAction.EXCEPTION,
)

guard.validate("123-456-7890")  # Guardrail passes

try:
    guard.validate("1234-789-0000")  # Guardrail fails
except Exception as e:
    print(e)
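To see why the second string fails, the pattern can be tried directly with Python's `re` module. This sketch assumes the validator requires the entire string to match, which the failing example suggests; the `samples` list and `results` dict are illustrative only.

```python
import re

# Same pattern as in the Guard above, written as a raw string.
PHONE = re.compile(r"\(?\d{3}\)?-? *\d{3}-? *-?\d{4}")

samples = ["123-456-7890", "(123) 456-7890", "1234-789-0000"]

results = {}
for text in samples:
    whole = PHONE.fullmatch(text) is not None   # entire string must match
    partial = PHONE.search(text) is not None    # any substring may match
    results[text] = (whole, partial)
    print(f"{text!r}: fullmatch={whole}, search={partial}")
```

The last sample contains a valid number as a substring ("234-789-0000"), so a substring search accepts it while a whole-string match rejects it. Check the validator's own documentation for which mode it applies by default.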

Commands and code are distilled from the project's own documentation; always check the official repo for the latest version.

When to use it

Block or flag toxic language in model responses before they reach end users
Enforce that an LLM does not mention named competitors in generated text
Validate that model output matches an expected format, such as a phone number pattern
Generate structured JSON data from an LLM that conforms to a Pydantic schema

How Guardrails AI compares

Guardrails AI alongside the other open-source guardrails and security tools that AI/TLDR tracks, ranked by GitHub stars.

Tool	Stars	What it does
Microsoft Presidio	★ 9.3k	A framework for detecting, redacting, masking, and anonymizing personal data (PII) in text, images, and structured data using NER models, regex, and rule-based recognizers.
Guardrails AI	★ 7k	Validate LLM inputs and outputs with composable guards in Python
NeMo Guardrails	★ 6.5k	NVIDIA's toolkit for adding programmable rails to LLM chat apps, using the Colang language to control dialog flow and block jailbreaks, prompt injection, and off-topic answers.
GLiNER	★ 3.3k	A small zero-shot named-entity recognition model that can extract arbitrary entity types from text and is widely used as a PII detection backend, including inside Presidio.
LLM Guard	★ 3.1k	A security toolkit from Protect AI with 35+ input and output scanners that sanitize prompts and responses for prompt injection, toxicity, PII leakage, and harmful content.
Rebuff	★ 1.5k	A prompt injection detector that combines heuristics, an LLM-based classifier, a vector store of past attacks, and canary tokens to catch attempts to subvert an LLM application.
Detoxify	★ 1.3k	Pretrained transformer models from Unitary that score text for toxicity, insults, threats, and hate speech, often used to moderate LLM inputs and outputs.
Vigil	★ 482	A Python library and REST API that scans LLM prompts and responses with YARA rules, transformer classifiers, and vector similarity to flag prompt injections and jailbreaks.
