Overview
garak is a command-line tool from NVIDIA that checks whether a large language model can be pushed into behaving in ways you don't want. It runs a library of probes against a target model to test for issues like prompt injection, jailbreaks, data leakage, hallucination, misinformation, and toxic output, then reports which probes the model failed and how often.
If you've used network or security scanners like nmap or Metasploit, garak does a similar job but for LLMs and dialog systems. It combines static, dynamic, and adaptive probes, and works with many model backends including Hugging Face Hub, the OpenAI API, Replicate, AWS Bedrock, LiteLLM, local GGUF models via llama.cpp, and most REST-accessible endpoints.
As an evaluation framework, garak fits the red-teaming and safety-testing stage of building with LLMs. It's aimed at engineers and security teams who want repeatable, automated checks on a model's weaknesses rather than ad-hoc manual prompting.
What it does
- 100+ probes covering prompt injection, jailbreaks (such as DAN), data leakage, toxicity, misinformation, and hallucination
- Works across many backends: Hugging Face Hub, OpenAI API, Replicate, AWS Bedrock, LiteLLM, GGUF/llama.cpp, and most REST endpoints
- Runs all known probes by default, or lets you target a specific probe family or single probe (e.g. promptinject, lmrc.SlurUsage)
- Detector-based scoring that marks undesirable responses as FAIL and reports the failure rate per probe
- Detailed logging to garak.log and a per-run .jsonl file, plus an analysis script to surface the prompts that caused the most hits
- Built-in probe listing with --list_probes to explore available attacks
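The per-probe failure rate mentioned in the list above can be recomputed from a run's .jsonl file. The following is a minimal sketch, not garak's own analysis script. It assumes that each line of the report is a JSON object and that summary records carry the fields `entry_type` (equal to `"eval"`), `probe`, `passed`, and `total`. These field names are assumptions; check them against a report file from your own run before relying on them. The function names `failure_rates` and `print_report` are hypothetical.

```python
import json
from collections import defaultdict


def failure_rates(lines):
    """Return {probe_name: failure_rate} from an iterable of JSONL lines.

    Assumed (unverified) schema: summary records look like
    {"entry_type": "eval", "probe": "...", "passed": int, "total": int}.
    """
    passed = defaultdict(int)
    total = defaultdict(int)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Ignore every record that is not an evaluation summary.
        if record.get("entry_type") != "eval":
            continue
        probe = record["probe"]
        # A probe may be scored by several detectors; pool their counts.
        passed[probe] += record["passed"]
        total[probe] += record["total"]
    return {
        probe: (total[probe] - passed[probe]) / total[probe]
        for probe in total
        if total[probe] > 0
    }


def print_report(path):
    """Print probes from the report at `path`, worst failure rate first."""
    with open(path, encoding="utf-8") as handle:
        rates = failure_rates(handle)
    for probe, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{probe}: {rate:.1%} failed")
```

Calling `print_report` with the path to a run's .jsonl file would list the weakest probes first, which is usually where a manual review should start.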
Getting started
garak is a command-line tool developed on Linux and macOS. Install it from PyPI, then point it at a model to start scanning.
Install with pip
Grab the latest release from PyPI.
python -m pip install -U garak
List available probes
See every probe garak can run before launching a scan.
garak --list_probes
Scan an OpenAI model
Set your API key, then run the encoding probes against a target model. Use --target_type for the model family and --target_name for the exact model.
export OPENAI_API_KEY="sk-123XXXXXXXXXXXX"
python3 -m garak --target_type openai --target_name gpt-5-nano --probes encoding
Test a Hugging Face model for a jailbreak
Load a model from the Hub and run a specific probe, such as the DAN 11.0 jailbreak.
python3 -m garak --target_type huggingface --target_name gpt2 --probes dan.Dan_11_0
Commands and code are distilled from the project's own documentation; always check the official repo for the latest.
When to use it
- Red-teaming a model before deployment to find prompt-injection and jailbreak weaknesses
- Comparing how susceptible different model versions are to the same class of attack (e.g. encoding-based injection)
- Adding automated safety and vulnerability checks to an LLM project's testing pipeline
- Auditing a hosted or REST-accessible model for data leakage and toxic-output risks
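For the testing-pipeline use case, one option is to launch garak from a small Python wrapper. The sketch below is an illustration under stated assumptions, not an official integration. It uses only the flags shown in the commands above (`--target_type`, `--target_name`, `--probes`). The helper names `build_garak_command` and `run_scan` are hypothetical, and the meaning of garak's exit code is not covered here, so the wrapper returns it without interpreting it.

```python
import subprocess
import sys


def build_garak_command(target_type, target_name, probes=None):
    """Build the argument list for a garak run.

    `probes` is passed through unchanged as a single probe spec,
    for example "encoding" or "dan.Dan_11_0". When it is omitted,
    no --probes flag is added and garak picks its default probes.
    """
    cmd = [
        sys.executable, "-m", "garak",
        "--target_type", target_type,
        "--target_name", target_name,
    ]
    if probes:
        cmd += ["--probes", probes]
    return cmd


def run_scan(target_type, target_name, probes=None):
    """Run garak as a subprocess and return its exit code.

    Assumption: garak is installed in the current Python environment.
    The exit code is returned as-is; inspect the run's report files
    to decide whether the scan found failures.
    """
    cmd = build_garak_command(target_type, target_name, probes)
    completed = subprocess.run(cmd, check=False)
    return completed.returncode
```

Passing the arguments as a list rather than a shell string avoids quoting problems when model names contain unusual characters.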
How garak compares
garak alongside the other open-source evaluation and red-teaming tools that AI/TLDR tracks, ranked by GitHub stars.
| Tool | Stars | What it does |
|---|---|---|
| Strix | ★ 26.1k | Strix runs autonomous AI agents that act like hackers, dynamically running your code to find vulnerabilities and validate them with real proof-of-concepts. |
| promptfoo | ★ 22.4k | A developer-first CLI and library for testing and comparing prompts and models, with red-teaming probes for prompt injection, PII leaks, and other vulnerabilities. |
| OpenAI Evals | ★ 18.7k | A framework and open registry for building and running evaluations of LLMs and LLM-based systems, including prompt chains and tool-using agents. |
| DeepEval | ★ 16.3k | An open-source Python framework that tests LLM apps like unit tests, with 50+ metrics for RAG, agents, chatbots, and safety, and a Pytest integration for CI/CD. |
| Ragas | ★ 14.4k | An evaluation toolkit focused on retrieval-augmented generation that scores answer faithfulness, context precision/recall, and relevancy, often without needing ground-truth labels. |
| Arize Phoenix | ★ 10.2k | An open-source observability and evaluation tool for tracing LLM and agent behavior, running evals on traces, and troubleshooting issues in development and production. |
| garak | ★ 8.2k | A command-line scanner that probes LLMs for prompt injection, jailbreaks, and data leakage. |
| Giskard | ★ 5.4k | An open-source library for testing and scanning LLM and ML models for issues like hallucination, bias, and toxicity, including multi-turn agent testing and a vulnerability scanner. |