Protect AI · 2024-03-20 · notable

Rebuff — Prompt Injection Detector

Item: Rebuff — Prompt Injection Detector
Rating: 3
Author: AI/TLDR

Self-hardening prompt injection detector. Uses heuristics, LLM analysis, and a vector database of known attacks. Learns from new attacks it encounters.

Rebuff prompt injection detector repository

Multi-layered prompt injection detection that learns from attacks.

Key specs

GitHub stars	800+
Detection layers	4

What is it?

Rebuff detects prompt injection attacks using four layers: heuristic analysis, LLM-based detection, vector similarity against known attacks, and a canary token system.

How does it work?

Pass user input through Rebuff before your LLM. It checks heuristics, asks an LLM 'is this an injection?', compares against a vector DB of attacks, and plants canary tokens to detect leakage.

Why does it matter?

Prompt injection is the #1 LLM vulnerability. Rebuff catches attacks that simple filters miss and gets smarter over time.

Who is it for?

Anyone building LLM apps that accept user input.

Try it

pip install rebuff