NeMo Guardrails

Add programmable rails to LLM chat apps to keep them safe and on-topic

github.com/NVIDIA-NeMo/Guardrails ★ 6.5k
docs.nvidia.com/nemo/guardrails/latest/index.html

Overview

NeMo Guardrails is an open-source toolkit from NVIDIA for adding programmable guardrails (or "rails") to LLM-based conversational applications. Rails are specific ways of controlling a model's behavior, such as keeping it off certain topics, following a predefined dialog path, or moderating its output before it reaches the user.

It sits between your application code and the LLM, so calling it is similar to calling the model directly. You load a guardrails configuration, create an LLMRails instance, and route your chat messages through its generate method. It works with multiple LLM providers and can also wrap LangChain chains.

As a guardrail framework, it gives developers a structured way to enforce safety and dialog policies. Rails are grouped into input, dialog, retrieval, execution, and output types, so you can reject or alter content at each stage of a request.
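To make the "reject or alter at each stage" idea concrete, here is a minimal conceptual sketch in plain Python. It is not the nemoguardrails API: RailResult, run_with_rails, block_secrets, mask_email_domain, and fake_llm are all hypothetical names invented for illustration, and only the input and output stages are shown. Dialog, retrieval, and execution rails would slot into the same loop at their own points in the request.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical types for illustration only; not part of nemoguardrails.
@dataclass
class RailResult:
    text: str              # text to pass on, possibly altered by the rail
    blocked: bool = False  # True means stop and return a refusal

Rail = Callable[[str], RailResult]

REFUSAL = "Sorry, I can't help with that."


def block_secrets(text: str) -> RailResult:
    """Input rail: reject messages that mention passwords."""
    return RailResult(text, blocked="password" in text.lower())


def mask_email_domain(text: str) -> RailResult:
    """Output rail: alter the reply instead of rejecting it."""
    return RailResult(text.replace("@example.com", "@[redacted]"))


def run_with_rails(
    user_text: str,
    llm: Callable[[str], str],
    input_rails: List[Rail],
    output_rails: List[Rail],
) -> str:
    # Input stage: each rail may block the request or rewrite the text.
    for rail in input_rails:
        result = rail(user_text)
        if result.blocked:
            return REFUSAL
        user_text = result.text

    reply = llm(user_text)

    # Output stage: same contract, applied to the model's reply.
    for rail in output_rails:
        result = rail(reply)
        if result.blocked:
            return REFUSAL
        reply = result.text
    return reply


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call so the sketch runs offline."""
    return f"Contact bob@example.com about: {prompt}"
```

In this sketch a blocked request never reaches the model, while an altered reply still reaches the user in its modified form.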

What it does

Five rail types cover the full request flow: input, dialog, retrieval, execution, and output
Colang-based dialog rails steer the model along predefined conversational paths and standard procedures
Protection against common LLM vulnerabilities such as jailbreaks and prompt injections
Works with multiple LLMs, including OpenAI GPT-3.5, GPT-4, LLaMa-2, Falcon, Vicuna, and Mosaic
Optional LangChain integration to wrap a guardrails layer around existing chains
Async-first design with both sync and async versions of public methods (generate and generate_async)
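The async variant mentioned in the last item might be used roughly as follows. This is a sketch under two assumptions: generate_async accepts the same chat-style messages argument as generate, and it returns a message dict with a "content" key. EchoRails is a hypothetical stand-in so the example runs without an LLM or API key; in a real app you would pass an LLMRails instance instead.

```python
import asyncio


async def ask(rails, text: str) -> str:
    """Send one user message through an LLMRails-like object and return the reply text."""
    reply = await rails.generate_async(
        messages=[{"role": "user", "content": text}]
    )
    # Assumption: with chat-style messages the reply is a message dict
    # such as {"role": "assistant", "content": "..."}.
    return reply["content"]


class EchoRails:
    """Hypothetical stand-in that mimics the generate_async call shape."""

    async def generate_async(self, messages):
        return {"role": "assistant", "content": "echo: " + messages[-1]["content"]}


async def main() -> None:
    rails = EchoRails()
    # Several requests can be awaited concurrently on one event loop.
    replies = await asyncio.gather(
        ask(rails, "Hello world!"),
        ask(rails, "Goodbye!"),
    )
    print(replies)


if __name__ == "__main__":
    asyncio.run(main())
```

The sync generate method suits scripts and notebooks; the async form is the more natural fit inside a web server that already runs an event loop.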

Getting started

Install the package with pip, then load a guardrails configuration and route your chat messages through an LLMRails instance. Requires Python 3.10, 3.11, 3.12, or 3.13.

Install with pip

Install the nemoguardrails package from PyPI.

bash

pip install nemoguardrails

Load a config and generate a response

Load a guardrails configuration from a path, create an LLMRails instance, and call generate with chat-style messages.

python

from nemoguardrails import LLMRails, RailsConfig

# Load a guardrails configuration from the specified path.
config = RailsConfig.from_path("PATH/TO/CONFIG")
rails = LLMRails(config)

completion = rails.generate(
    messages=[{"role": "user", "content": "Hello world!"}]
)

Commands and code are distilled from the project's own documentation; always check the official repo for the latest.

When to use it

Question answering over documents (RAG) where you need fact-checking and output moderation
Domain-specific chatbots that must stay on topic and follow designed conversational flows
Adding a safety layer to a custom LLM endpoint for safer customer interaction
Wrapping a guardrails layer around existing LangChain chains
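For the on-topic chatbot case, a dialog rail could look something like the sketch below. The Colang snippet, the flow and message names, and the model settings are illustrative assumptions written in Colang 1.0 style, not content taken from this page. RailsConfig.from_content is used here in place of from_path so the configuration can live inline. Calling build_rails needs the nemoguardrails package and credentials for whichever provider the YAML names.

```python
# Illustrative Colang 1.0-style content (an assumption, not the project's shipped config).
COLANG_CONTENT = """
define user ask about politics
  "What do you think about the government?"
  "Which party should I vote for?"

define bot refuse to discuss politics
  "I'm a shopping assistant, so I'd rather not discuss politics."

define flow politics
  user ask about politics
  bot refuse to discuss politics
"""

# Assumed model settings; swap in the engine and model you actually use.
YAML_CONTENT = """
models:
  - type: main
    engine: openai
    model: gpt-3.5-turbo-instruct
"""


def build_rails():
    """Build an LLMRails instance from the inline config above."""
    # Imported lazily so the strings can be inspected without the package installed.
    from nemoguardrails import LLMRails, RailsConfig

    config = RailsConfig.from_content(
        colang_content=COLANG_CONTENT,
        yaml_content=YAML_CONTENT,
    )
    return LLMRails(config)
```

The example user utterances act as samples of an intent rather than exact strings to match, so a handful of representative phrasings is usually enough to start with.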

How NeMo Guardrails compares

NeMo Guardrails alongside the other open-source guardrails and security tools that AI/TLDR tracks, ranked by GitHub stars.

Tool	Stars	What it does
Microsoft Presidio	★ 9.3k	A framework for detecting, redacting, masking, and anonymizing personal data (PII) in text, images, and structured data using NER models, regex, and rule-based recognizers.
Guardrails AI	★ 7k	A Python framework that wraps LLM calls with composable input/output validators (from the Guardrails Hub) to check structure, type, and safety risks before responses reach users.
NeMo Guardrails	★ 6.5k	Add programmable rails to LLM chat apps to keep them safe and on-topic
GLiNER	★ 3.3k	A small zero-shot named-entity recognition model that can extract arbitrary entity types from text and is widely used as a PII detection backend, including inside Presidio.
LLM Guard	★ 3.1k	A security toolkit from Protect AI with 35+ input and output scanners that sanitize prompts and responses for prompt injection, toxicity, PII leakage, and harmful content.
Rebuff	★ 1.5k	A prompt injection detector that combines heuristics, an LLM-based classifier, a vector store of past attacks, and canary tokens to catch attempts to subvert an LLM application.
Detoxify	★ 1.3k	Pretrained transformer models from Unitary that score text for toxicity, insults, threats, and hate speech, often used to moderate LLM inputs and outputs.
Vigil	★ 482	A Python library and REST API that scans LLM prompts and responses with YARA rules, transformer classifiers, and vector similarity to flag prompt injections and jailbreaks.
