AI/TLDR

garak

Scan LLMs for prompt injection, jailbreaks, and data leakage from the command line

Overview

garak is a command-line tool from NVIDIA that checks whether a large language model can be pushed into behaving in ways you don't want. It runs a library of probes against a target model to test for issues like prompt injection, jailbreaks, data leakage, hallucination, misinformation, and toxic output, then reports which probes the model failed and how often.

If you've used network or security scanners like nmap or Metasploit, garak does a similar job but for LLMs and dialog systems. It combines static, dynamic, and adaptive probes, and works with many model backends including Hugging Face Hub, the OpenAI API, Replicate, AWS Bedrock, LiteLLM, local GGUF models via llama.cpp, and most REST-accessible endpoints.

As an evaluation framework, garak fits the red-teaming and safety-testing stage of building with LLMs. It's aimed at engineers and security teams who want repeatable, automated checks on a model's weaknesses rather than ad-hoc manual prompting.

What it does

  • 100+ probes covering prompt injection, jailbreaks (such as DAN), data leakage, toxicity, misinformation, and hallucination
  • Works across many backends: Hugging Face Hub, OpenAI API, Replicate, AWS Bedrock, LiteLLM, GGUF/llama.cpp, and most REST endpoints
  • Runs all known probes by default, or lets you target a specific probe family or single probe (e.g. promptinject, lmrc.SlurUsage)
  • Detector-based scoring that marks undesirable responses as FAIL and reports the failure rate per probe
  • Detailed logging to garak.log and a per-run .jsonl file, plus an analysis script to surface the prompts that caused the most hits
  • Built-in probe listing with --list_probes to explore available attacks
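
The per-probe failure rate in the list above is failed attempts divided by total attempts. A minimal Python sketch of that arithmetic follows. It assumes the per-run .jsonl file contains summary records with `entry_type`, `probe`, `detector`, `passed`, and `total` fields; those field names and the sample records are assumptions for illustration, so inspect a real report from your garak version before relying on them.

```python
import json
from collections import defaultdict


def failure_rates(lines):
    """Return {probe: failure rate in [0, 1]} from JSONL report lines.

    Assumption (not confirmed by this page): summary records look like
    {"entry_type": "eval", "probe": ..., "detector": ..., "passed": n, "total": m}.
    Counts are summed across all detectors reported for a probe.
    """
    passed = defaultdict(int)
    total = defaultdict(int)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        if record.get("entry_type") != "eval":
            continue  # ignore anything that is not a summary record
        passed[record["probe"]] += record["passed"]
        total[record["probe"]] += record["total"]
    return {
        probe: (total[probe] - passed[probe]) / total[probe]
        for probe in total
        if total[probe] > 0  # avoid dividing by zero on empty probes
    }


# Hypothetical sample records, for illustration only.
SAMPLE = [
    '{"entry_type": "eval", "probe": "lmrc.SlurUsage", "detector": "d1", "passed": 3, "total": 4}',
    '{"entry_type": "attempt", "probe": "lmrc.SlurUsage"}',
    '{"entry_type": "eval", "probe": "dan.Dan_11_0", "detector": "d2", "passed": 10, "total": 10}',
]

rates = failure_rates(SAMPLE)
for probe, rate in sorted(rates.items()):
    print(f"{probe}: {rate:.1%} failed")
```

To run it on a real report, pass an open file handle instead of `SAMPLE`; iterating over a file yields one line at a time, which is what the function expects.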

Getting started

garak is a command-line tool developed on Linux and macOS. Install it from PyPI, then point it at a model to start scanning.

Install with pip

Grab the latest release from PyPI.

python -m pip install -U garak

List available probes

See every probe garak can run before launching a scan.

garak --list_probes

Scan an OpenAI model

Set your API key, then run the encoding probes against a target model. Use --target_type for the model family and --target_name for the exact model.

export OPENAI_API_KEY="sk-123XXXXXXXXXXXX"
python3 -m garak --target_type openai --target_name gpt-5-nano --probes encoding

Test a Hugging Face model for a jailbreak

Load a model from the Hub and run a specific probe, such as the DAN 11.0 jailbreak.

python3 -m garak --target_type huggingface --target_name gpt2 --probes dan.Dan_11_0
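
Both scans follow the same pattern: a target type, a target name, and an optional probe selection. If you drive garak from a Python script, a small helper can build that argument list. This is a sketch, not part of garak: `garak_command` and `run_scan` are hypothetical helper names, and they use only the flags shown on this page.

```python
import subprocess
import sys


def garak_command(target_type, target_name, probes=None):
    """Build the argv for a garak scan using only flags shown on this page."""
    cmd = [
        sys.executable, "-m", "garak",
        "--target_type", target_type,
        "--target_name", target_name,
    ]
    if probes:
        # A probe family ("encoding") or a single probe ("dan.Dan_11_0").
        cmd += ["--probes", probes]
    # With no --probes flag, garak runs all known probes by default.
    return cmd


def run_scan(target_type, target_name, probes=None):
    """Run the scan and return the process exit code.

    Requires garak to be installed. What the exit code means is not
    covered on this page, so treat it as informational only.
    """
    result = subprocess.run(garak_command(target_type, target_name, probes), check=False)
    return result.returncode


# Build (but do not run) the Hugging Face example from this page.
cmd = garak_command("huggingface", "gpt2", "dan.Dan_11_0")
print(" ".join(cmd[1:]))
```

Passing the arguments as a list, rather than one shell string, avoids quoting problems if a model name contains spaces or special characters.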

Commands and code are distilled from the project's own documentation; always check the official repo for the latest.

When to use it

  • Red-teaming a model before deployment to find prompt-injection and jailbreak weaknesses
  • Comparing how susceptible different model versions are to the same class of attack (e.g. encoding-based injection)
  • Adding automated safety and vulnerability checks to an LLM project's testing pipeline
  • Auditing a hosted or REST-accessible model for data leakage and toxic-output risks
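
For the version-comparison and pipeline use cases in the list above, one possible approach is to compare per-probe failure rates from two runs and fail the build when a probe gets noticeably worse. The sketch below assumes the failure rates have already been extracted into dictionaries. The function name, the tolerance value, and the sample numbers are all hypothetical and are not garak features.

```python
def regressions(baseline, candidate, tolerance=0.02):
    """Return {probe: (old_rate, new_rate)} for probes that got worse.

    A probe counts as regressed when its failure rate rose by more than
    `tolerance`. Probes missing from the baseline are compared against 0.0,
    so any meaningful failure rate on a new probe is flagged.
    """
    worse = {}
    for probe, new_rate in candidate.items():
        old_rate = baseline.get(probe, 0.0)
        if new_rate - old_rate > tolerance:
            worse[probe] = (old_rate, new_rate)
    return worse


# Hypothetical failure rates for two model versions, for illustration only.
baseline = {"encoding": 0.10, "promptinject": 0.30}
candidate = {"encoding": 0.11, "promptinject": 0.45}

found = regressions(baseline, candidate)
for probe, (old, new) in sorted(found.items()):
    print(f"{probe}: {old:.0%} -> {new:.0%}")

# A CI job could pass this value to sys.exit() to fail the build.
exit_code = 1 if found else 0
```

The tolerance exists because failure rates from sampled model output vary from run to run; a small allowance keeps the check from failing on noise.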

How garak compares

garak alongside the other open-source evaluation and red-teaming tools that AI/TLDR tracks, ranked by GitHub stars.

Tool | Stars | What it does
Strix | ★ 26.1k | Strix runs autonomous AI agents that act like hackers, dynamically running your code to find vulnerabilities and validate them with real proof-of-concepts.
promptfoo | ★ 22.4k | A developer-first CLI and library for testing and comparing prompts and models, with red-teaming probes for prompt injection, PII leaks, and other vulnerabilities.
OpenAI Evals | ★ 18.7k | A framework and open registry for building and running evaluations of LLMs and LLM-based systems, including prompt chains and tool-using agents.
DeepEval | ★ 16.3k | An open-source Python framework that tests LLM apps like unit tests, with 50+ metrics for RAG, agents, chatbots, and safety, and a Pytest integration for CI/CD.
Ragas | ★ 14.4k | An evaluation toolkit focused on retrieval-augmented generation that scores answer faithfulness, context precision/recall, and relevancy, often without needing ground-truth labels.
Arize Phoenix | ★ 10.2k | An open-source observability and evaluation tool for tracing LLM and agent behavior, running evals on traces, and troubleshooting issues in development and production.
garak | ★ 8.2k | Scan LLMs for prompt injection, jailbreaks, and data leakage from the command line
Giskard | ★ 5.4k | An open-source library for testing and scanning LLM and ML models for issues like hallucination, bias, and toxicity, including multi-turn agent testing and a vulnerability scanner.