garak

Scan LLMs for prompt injection, jailbreaks, and data leakage from the command line

Overview

garak is a command-line tool from NVIDIA that checks whether a large language model can be pushed into behaving in ways you don't want. It runs a library of probes against a target model to test for issues like prompt injection, jailbreaks, data leakage, hallucination, misinformation, and toxic output, then reports which probes the model failed and how often.

If you've used network or security scanners like nmap or Metasploit, garak does a similar job but for LLMs and dialog systems. It combines static, dynamic, and adaptive probes, and works with many model backends including Hugging Face Hub, the OpenAI API, Replicate, AWS Bedrock, LiteLLM, local GGUF models via llama.cpp, and most REST-accessible endpoints.

As an evaluation framework, garak fits the red-teaming and safety-testing stage of building with LLMs. It's aimed at engineers and security teams who want repeatable, automated checks on a model's weaknesses rather than ad-hoc manual prompting.

What it does

100+ probes covering prompt injection, jailbreaks (such as DAN), data leakage, toxicity, misinformation, and hallucination
Works across many backends: Hugging Face Hub, OpenAI API, Replicate, AWS Bedrock, LiteLLM, GGUF/llama.cpp, and most REST endpoints
Runs all known probes by default, or lets you target a specific probe family or single probe (e.g. promptinject, lmrc.SlurUsage)
Detector-based scoring that marks undesirable responses as FAIL and reports the failure rate per probe
Detailed logging to garak.log and a per-run .jsonl file, plus an analysis script to surface the prompts that caused the most hits
Built-in probe listing with --list_probes to explore available attacks
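
To make the per-probe failure rate concrete, here is a minimal Python sketch that tallies it from report lines. It assumes each line of the per-run .jsonl file is a JSON object and that evaluation rows carry entry_type, probe, passed, and total fields. Those field names are assumptions for illustration, not a documented schema, so check them against the report your garak version writes. The analysis script that ships with garak remains the supported route.

```python
import json
from collections import defaultdict


def failure_rates(lines):
    """Return {probe: failure rate} from an iterable of JSONL report lines.

    Assumed (unverified) schema: evaluation rows look like
    {"entry_type": "eval", "probe": str, "passed": int, "total": int}.
    Rows for the same probe (e.g. several detectors) are summed together.
    """
    passed = defaultdict(int)
    total = defaultdict(int)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        if record.get("entry_type") != "eval":
            continue  # ignore rows that are not evaluation summaries
        probe = record["probe"]
        passed[probe] += record["passed"]
        total[probe] += record["total"]
    return {
        probe: (total[probe] - passed[probe]) / total[probe]
        for probe in total
        if total[probe] > 0
    }


# Hypothetical sample rows for illustration, not real garak output.
sample = [
    '{"entry_type": "eval", "probe": "encoding.InjectBase64", "passed": 30, "total": 40}',
    '{"entry_type": "eval", "probe": "dan.Dan_11_0", "passed": 5, "total": 5}',
    '{"entry_type": "attempt", "probe": "dan.Dan_11_0"}',
]
rates = failure_rates(sample)
for probe, rate in sorted(rates.items()):
    print(f"{probe}: {rate:.0%} failed")
```

To run it on a real report, pass an open file handle instead of the sample list, since the function accepts any iterable of lines.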

Getting started

garak is a command-line tool developed on Linux and macOS. Install it from PyPI, then point it at a model to start scanning.

Install with pip

Grab the latest release from PyPI.

bash

python -m pip install -U garak

List available probes

See every probe garak can run before launching a scan.

bash

garak --list_probes

Scan an OpenAI model

Set your API key, then run the encoding probes against a target model. Use --target_type for the model family and --target_name for the exact model.

bash

export OPENAI_API_KEY="sk-123XXXXXXXXXXXX"
python3 -m garak --target_type openai --target_name gpt-5-nano --probes encoding

Test a Hugging Face model for a jailbreak

Load a model from the Hub and run a specific probe, such as the DAN 11.0 jailbreak.

bash

python3 -m garak --target_type huggingface --target_name gpt2 --probes dan.Dan_11_0

Commands and code are distilled from the project's own documentation; always check the official repo for the latest flags and probe names.
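
If you want to script the scans above rather than type them one at a time, a small Python wrapper can build the command lines for several targets. The sketch below is illustrative and not taken from the garak documentation: scan_commands is a hypothetical helper, and it uses only the --target_type, --target_name, and --probes flags that appear in the commands above. It prints the commands instead of running them.

```python
import shlex
import sys


def scan_commands(targets, probes):
    """Build one garak argv list per (target_type, target_name) pair.

    Hypothetical helper; only flags shown in the commands above are used.
    """
    return [
        [sys.executable, "-m", "garak",
         "--target_type", target_type,
         "--target_name", target_name,
         "--probes", probes]
        for target_type, target_name in targets
    ]


# The two targets used in the examples above, scanned with the same probe family.
targets = [("openai", "gpt-5-nano"), ("huggingface", "gpt2")]
commands = scan_commands(targets, probes="encoding")
for argv in commands:
    # Show the command without the full interpreter path for readability.
    print("python3 " + shlex.join(argv[1:]))
    # To launch for real (garak installed, API keys exported), add
    # "import subprocess" and call: subprocess.run(argv, check=True)
```

Building argv lists rather than shell strings avoids quoting problems if a model name contains unusual characters.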

When to use it

Red-teaming a model before deployment to find prompt-injection and jailbreak weaknesses
Comparing how susceptible different model versions are to the same class of attack (e.g. encoding-based injection)
Adding automated safety and vulnerability checks to an LLM project's testing pipeline
Auditing a hosted or REST-accessible model for data leakage and toxic-output risks
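
For the testing-pipeline case, one possible pattern is a gate that fails the build when any probe's failure rate crosses a threshold you choose. The sketch below assumes you already have per-probe failure rates as a dictionary; the probes_over_threshold function, the 10% threshold, and the sample numbers are all hypothetical and not part of garak.

```python
def probes_over_threshold(rates, max_failure_rate=0.1):
    """Return the sorted probe names whose failure rate exceeds the threshold."""
    return sorted(
        probe for probe, rate in rates.items() if rate > max_failure_rate
    )


# Hypothetical per-probe failure rates; in practice derive them from a run's report.
rates = {"encoding.InjectBase64": 0.25, "dan.Dan_11_0": 0.0}
offenders = probes_over_threshold(rates, max_failure_rate=0.1)

# A CI job would pass this value to sys.exit(); it is only printed here.
exit_code = 1 if offenders else 0
print(f"exit code {exit_code}; probes over threshold: {offenders}")
```

A single global threshold is the simplest choice; a per-probe threshold map would be a natural extension if some probe families matter more to your deployment than others.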

How garak compares

garak alongside the other open-source evaluation and red-teaming tools that AI/TLDR tracks, ranked by GitHub stars.

Tool	Stars	What it does
Strix	★ 26.1k	Strix runs autonomous AI agents that act like hackers, dynamically running your code to find vulnerabilities and validate them with real proof-of-concepts.
promptfoo	★ 22.4k	A developer-first CLI and library for testing and comparing prompts and models, with red-teaming probes for prompt injection, PII leaks, and other vulnerabilities.
OpenAI Evals	★ 18.7k	A framework and open registry for building and running evaluations of LLMs and LLM-based systems, including prompt chains and tool-using agents.
DeepEval	★ 16.3k	An open-source Python framework that tests LLM apps like unit tests, with 50+ metrics for RAG, agents, chatbots, and safety, and a Pytest integration for CI/CD.
Ragas	★ 14.4k	An evaluation toolkit focused on retrieval-augmented generation that scores answer faithfulness, context precision/recall, and relevancy, often without needing ground-truth labels.
Arize Phoenix	★ 10.2k	An open-source observability and evaluation tool for tracing LLM and agent behavior, running evals on traces, and troubleshooting issues in development and production.
garak	★ 8.2k	A command-line scanner that probes LLMs for prompt injection, jailbreaks, and data leakage.
Giskard	★ 5.4k	An open-source library for testing and scanning LLM and ML models for issues like hallucination, bias, and toxicity, including multi-turn agent testing and a vulnerability scanner.
