AI/TLDR

NeMo Guardrails

Add programmable rails to LLM chat apps to keep them safe and on-topic

Overview

NeMo Guardrails is an open-source toolkit from NVIDIA for adding programmable guardrails (or "rails") to LLM-based conversational applications. Rails are specific ways of controlling a model's behavior, such as keeping it off certain topics, following a predefined dialog path, or moderating its output before it reaches the user.

It sits between your application code and the LLM, so calling it is similar to calling the model directly. You load a guardrails configuration, create an LLMRails instance, and route your chat messages through its generate method. It works with multiple LLM providers and can also wrap LangChain chains.

As a guardrail framework, it gives developers a structured way to enforce safety and dialog policies. Rails are grouped into five types (input, dialog, retrieval, execution, and output), so you can reject or alter content at each stage of a request.
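
To make the idea of rejecting or altering content at each stage concrete, here is a minimal conceptual sketch in plain Python. It is not NeMo Guardrails' API: the `Rail` type, `block_if_contains`, `mask_word`, `run_rails`, and `guarded_call` are hypothetical names invented for illustration, and only the input and output stages are modeled.

```python
from typing import Callable, List, Optional

# Hypothetical shape, not NeMo Guardrails' API: a rail is a function that
# returns (possibly altered) text, or None to block the request.
Rail = Callable[[str], Optional[str]]

REFUSAL = "Sorry, I can't help with that."


def block_if_contains(banned: List[str]) -> Rail:
    """Build a rail that blocks text containing any banned phrase."""
    def rail(text: str) -> Optional[str]:
        lowered = text.lower()
        return None if any(phrase in lowered for phrase in banned) else text
    return rail


def mask_word(word: str) -> Rail:
    """Build a rail that alters text by masking one word with asterisks."""
    def rail(text: str) -> Optional[str]:
        return text.replace(word, "*" * len(word))
    return rail


def run_rails(text: str, rails: List[Rail]) -> Optional[str]:
    """Apply rails in order and stop as soon as one of them blocks."""
    for rail in rails:
        result = rail(text)
        if result is None:
            return None
        text = result
    return text


def guarded_call(
    user_text: str,
    llm: Callable[[str], str],
    input_rails: List[Rail],
    output_rails: List[Rail],
) -> str:
    """Check the input, call the model, then check the output."""
    checked = run_rails(user_text, input_rails)
    if checked is None:
        return REFUSAL
    answer = run_rails(llm(checked), output_rails)
    return REFUSAL if answer is None else answer
```

In the real toolkit these stages are declared in a guardrails configuration rather than hand-written like this; the sketch only shows where each check sits relative to the model call.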

What it does

  • Five rail types cover the full request flow: input, dialog, retrieval, execution, and output
  • Colang-based dialog rails steer the model along predefined conversational paths and standard procedures
  • Protection against common LLM vulnerabilities such as jailbreaks and prompt injections
  • Works with multiple LLMs, including OpenAI GPT-3.5, GPT-4, LLaMa-2, Falcon, Vicuna, and Mosaic
  • Optional LangChain integration to wrap a guardrails layer around existing chains
  • Async-first design with both sync and async versions of public methods (generate and generate_async)
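
As a rough illustration of the async side, the sketch below sends several independent questions concurrently through one rails object using `generate_async`. It assumes `rails` is an already-constructed `LLMRails` instance and that, when called with `messages`, the method returns a dict with a `"content"` key; `ask_many` is a hypothetical helper name.

```python
import asyncio
from typing import Any, List


async def ask_many(rails: Any, questions: List[str]) -> List[str]:
    """Send independent single-turn questions concurrently.

    `rails` is assumed to expose `generate_async(messages=...)` and to
    return an assistant message dict with a "content" key.
    """
    async def ask(question: str) -> str:
        reply = await rails.generate_async(
            messages=[{"role": "user", "content": question}]
        )
        return reply["content"]

    # gather preserves the order of the input questions.
    return list(await asyncio.gather(*(ask(q) for q in questions)))
```

From synchronous code, this could be driven with `asyncio.run(ask_many(rails, ["Hi", "What can you do?"]))`.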

Getting started

Install the package with pip, then load a guardrails configuration and route your chat messages through an LLMRails instance. Requires Python 3.10, 3.11, 3.12, or 3.13.

Install with pip

Install the nemoguardrails package from PyPI.

```bash
pip install nemoguardrails
```
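
If you want to confirm the environment before going further, a small check like the following may help. It is only a sketch: the supported versions are copied from the requirement stated above, and `python_supported` and `package_installed` are hypothetical helper names.

```python
import importlib.util
import sys
from typing import Tuple

# Versions listed in this guide; adjust if the project's requirements change.
SUPPORTED = {(3, 10), (3, 11), (3, 12), (3, 13)}


def python_supported(version: Tuple[int, int] = sys.version_info[:2]) -> bool:
    """Return True if the (major, minor) version is in the supported set."""
    return tuple(version) in SUPPORTED


def package_installed(name: str = "nemoguardrails") -> bool:
    """Return True if the top-level package can be found for import."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    print("Python version supported:", python_supported())
    print("nemoguardrails installed:", package_installed())
```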

Load a config and generate a response

Load a guardrails configuration from a path, create an LLMRails instance, and call generate with chat-style messages.

```python
from nemoguardrails import LLMRails, RailsConfig

# Load a guardrails configuration from the specified path.
config = RailsConfig.from_path("PATH/TO/CONFIG")
rails = LLMRails(config)

completion = rails.generate(
    messages=[{"role": "user", "content": "Hello world!"}]
)
```
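
Because `generate` takes chat-style messages, a multi-turn conversation can be kept by appending each reply to the history before the next call. The helper below is a minimal sketch of that pattern, assuming `generate` returns an assistant message dict with a `"content"` key when called with `messages`; `chat_turn` is a hypothetical name, not part of the library.

```python
from typing import Any, Dict, List

Message = Dict[str, str]


def chat_turn(rails: Any, history: List[Message], user_text: str) -> str:
    """Run one guarded turn and record both sides in `history`.

    `rails` is assumed to be an LLMRails instance (or anything with a
    compatible `generate(messages=...)` method).
    """
    history.append({"role": "user", "content": user_text})
    reply = rails.generate(messages=history)
    history.append({"role": "assistant", "content": reply["content"]})
    return reply["content"]
```

Calling `chat_turn(rails, history, "Hello world!")` repeatedly with the same `history` list passes the earlier turns along with each new message.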

Commands and code are distilled from the project's own documentation — always check the official repo for the latest.

When to use it

  • Question answering over documents (RAG) where you need fact-checking and output moderation
  • Domain-specific chatbots that must stay on topic and follow designed conversational flows
  • Adding a safety layer to a custom LLM endpoint for safer customer interaction
  • Wrapping a guardrails layer around existing LangChain chains
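
For the on-topic chatbot case, a dialog rail might look like the sketch below, which builds a configuration from inline strings instead of a directory. The Colang 1.0 flow, the example utterances, and the model entry are illustrative assumptions; `build_rails` is a hypothetical helper, and constructing `LLMRails` with an OpenAI model would also need the relevant provider package and API key.

```python
# Colang 1.0 definitions: example user utterances, a canned bot reply,
# and a flow that ties the two together. Wording is illustrative only.
COLANG = """
define user ask about politics
  "What do you think about the government?"
  "Which party should I vote for?"

define bot refuse politics
  "I can only help with questions about our products."

define flow politics
  user ask about politics
  bot refuse politics
"""

# Assumed model settings; swap in whichever engine and model you use.
YAML = """
models:
  - type: main
    engine: openai
    model: gpt-3.5-turbo-instruct
"""


def build_rails():
    """Create an LLMRails instance from the inline config above."""
    # Imported lazily so this module loads even without the package.
    from nemoguardrails import LLMRails, RailsConfig

    config = RailsConfig.from_content(colang_content=COLANG, yaml_content=YAML)
    return LLMRails(config)
```

With this in place, a political question is expected to match the `politics` flow and get the canned refusal rather than a free-form model answer, though matching quality depends on the example utterances and the underlying model.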

How NeMo Guardrails compares

NeMo Guardrails alongside other open-source guardrails & security tools AI/TLDR tracks, ranked by GitHub stars.

| Tool | Stars | What it does |
| --- | --- | --- |
| Microsoft Presidio | ★ 9.3k | A framework for detecting, redacting, masking, and anonymizing personal data (PII) in text, images, and structured data using NER models, regex, and rule-based recognizers. |
| Guardrails AI | ★ 7k | A Python framework that wraps LLM calls with composable input/output validators (from the Guardrails Hub) to check structure, type, and safety risks before responses reach users. |
| NeMo Guardrails | ★ 6.5k | Add programmable rails to LLM chat apps to keep them safe and on-topic |
| GLiNER | ★ 3.3k | A small zero-shot named-entity recognition model that can extract arbitrary entity types from text and is widely used as a PII detection backend, including inside Presidio. |
| LLM Guard | ★ 3.1k | A security toolkit from Protect AI with 35+ input and output scanners that sanitize prompts and responses for prompt injection, toxicity, PII leakage, and harmful content. |
| Rebuff | ★ 1.5k | A prompt injection detector that combines heuristics, an LLM-based classifier, a vector store of past attacks, and canary tokens to catch attempts to subvert an LLM application. |
| Detoxify | ★ 1.3k | Pretrained transformer models from Unitary that score text for toxicity, insults, threats, and hate speech, often used to moderate LLM inputs and outputs. |
| Vigil | ★ 482 | A Python library and REST API that scans LLM prompts and responses with YARA rules, transformer classifiers, and vector similarity to flag prompt injections and jailbreaks. |