Strix

Open-source AI agents that hack your app to find and prove real vulnerabilities

github.com/usestrix/strix | ★ 26.1k | strix.ai

Overview

Strix is an open-source security testing tool built around autonomous AI agents that behave like real hackers. Instead of only scanning code statically, the agents run your application dynamically, look for weaknesses, and confirm each finding with an actual proof-of-concept so you get fewer false positives.

It is aimed at developers and security teams who want fast, accurate testing without the cost of manual pentesting. Strix ships as a developer-first CLI that produces actionable reports, can run teams of agents that collaborate, and plugs into CI/CD so insecure code can be caught before it reaches production.

What it does

Full hacker toolkit out of the box: HTTP proxy, browser automation, terminal shells, and a Python runtime for custom exploits
Real validation: each finding is backed by a working proof-of-concept rather than reported unverified, which cuts false positives
Detects a wide range of issues: access control (IDOR, privilege escalation), injection (SQL, NoSQL, command), SSRF/XXE, XSS, business-logic and auth flaws
Graph of Agents: multiple specialized agents run in parallel and share discoveries for broad coverage
Works with any supported LLM provider (OpenAI, Anthropic, Google, and others), set via environment variables
Headless non-interactive mode and a GitHub Actions workflow for scanning pull requests in CI/CD

Getting started

Strix needs Docker running and an LLM API key from a supported provider (OpenAI, Anthropic, Google, etc.). You install it with a single script, point it at your AI provider, and run a scan against a local directory, a GitHub repo, or a live URL.

Install Strix

Install the CLI with the official install script.

bash

curl -sSL https://strix.ai/install | bash

Configure your AI provider

Set the model and API key as environment variables. Strix saves this configuration to ~/.strix/cli-config.json so you do not have to re-enter it each run.

bash

export STRIX_LLM="openai/gpt-5.4"
export LLM_API_KEY="your-api-key"
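Because Strix needs both variables and a running Docker daemon, a quick preflight check can save a failed run. The following is a minimal sketch, not part of Strix: the function names `check_strix_env` and `docker_is_running` are hypothetical helpers, and the only Strix-specific names used are the two variables set above.

```bash
# Hypothetical preflight helpers; these names are illustrative, not Strix commands.

# Return 0 only if both variables from the step above are set and non-empty.
check_strix_env() {
  local missing=0 name
  for name in STRIX_LLM LLM_API_KEY; do
    # ${!name} is bash indirect expansion: the value of the variable named by $name.
    if [ -z "${!name:-}" ]; then
      echo "missing: ${name}" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Return 0 only if the Docker CLI can reach a running daemon.
docker_is_running() {
  docker info >/dev/null 2>&1
}
```

For example, `check_strix_env && docker_is_running && strix --target ./app-directory` would start the scan only when both checks pass.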

Run your first security assessment

Point Strix at a local app directory. The first run automatically pulls the sandbox Docker image, and results are saved to strix_runs/<run-name>.

bash

strix --target ./app-directory
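Since each run writes to its own directory under strix_runs, it can be handy to locate the newest one. This is a sketch under assumptions: `latest_run_dir` is a hypothetical helper, and it relies only on the strix_runs/<run-name> layout described above, not on any particular file names inside a run directory.

```bash
# Hypothetical helper: print the most recently modified run directory.
# The default base directory, strix_runs, is the results location named above.
latest_run_dir() {
  local base="${1:-strix_runs}" dir newest=""
  for dir in "$base"/*/; do
    # Skip the unexpanded glob pattern when there are no run directories.
    [ -d "$dir" ] || continue
    # -nt is true when the left path was modified more recently than the right.
    if [ -z "$newest" ] || [ "$dir" -nt "$newest" ]; then
      newest="$dir"
    fi
  done
  # Print the path without the trailing slash; print nothing if no run exists.
  [ -n "$newest" ] && printf '%s\n' "${newest%/}"
}
```

Running `ls "$(latest_run_dir)"` from the directory where you started the scan would then list the contents of the latest run.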

Scan a repo or live app

You can also target a GitHub repository or a deployed application for a black-box assessment.

bash

strix --target https://github.com/org/repo
strix --target https://your-app.com
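If you want to assess several targets in sequence, a small wrapper can loop over them using the same `strix --target` invocation. This is only a sketch: `scan_targets` and the `STRIX_CMD` variable are hypothetical names introduced here, not Strix settings, and the source does not describe what exit status Strix returns, so the wrapper simply reports whatever status the command gives.

```bash
# Hypothetical helper: run one scan per target, one after another.
# STRIX_CMD is an assumption of this sketch (not a Strix setting); it lets you
# substitute another command, such as echo, for a dry run.
scan_targets() {
  local cmd="${STRIX_CMD:-strix}"
  local target failed=0
  for target in "$@"; do
    echo "==> scanning ${target}"
    # Same invocation as above: strix --target <target>
    if ! "$cmd" --target "$target"; then
      echo "scan command returned non-zero for ${target}" >&2
      failed=$((failed + 1))
    fi
  done
  # Succeed only if every invocation returned zero.
  [ "$failed" -eq 0 ]
}
```

A dry run such as `STRIX_CMD=echo scan_targets ./app-directory https://your-app.com` prints the commands that would be issued without starting a scan.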

Commands and code are distilled from the project's own documentation; always check the official repo for the latest version.

When to use it

Application security testing: detect and validate critical vulnerabilities in your applications
Rapid penetration testing: get pentests done in hours instead of weeks, with compliance reports
Bug bounty automation: automate research and generate proof-of-concepts for faster reporting
CI/CD integration: run tests on pull requests to block vulnerabilities before they reach production

How Strix compares

Strix alongside other open-source evaluation and red-teaming tools that AI/TLDR tracks, ranked by GitHub stars.

Tool	Stars	What it does
Strix	★ 26.1k	Open-source AI agents that hack your app to find and prove real vulnerabilities
promptfoo	★ 22.4k	A developer-first CLI and library for testing and comparing prompts and models, with red-teaming probes for prompt injection, PII leaks, and other vulnerabilities.
OpenAI Evals	★ 18.7k	A framework and open registry for building and running evaluations of LLMs and LLM-based systems, including prompt chains and tool-using agents.
DeepEval	★ 16.3k	An open-source Python framework that tests LLM apps like unit tests, with 50+ metrics for RAG, agents, chatbots, and safety, and a Pytest integration for CI/CD.
Ragas	★ 14.4k	An evaluation toolkit focused on retrieval-augmented generation that scores answer faithfulness, context precision/recall, and relevancy, often without needing ground-truth labels.
Arize Phoenix	★ 10.2k	An open-source observability and evaluation tool for tracing LLM and agent behavior, running evals on traces, and troubleshooting issues in development and production.
garak	★ 8.2k	An LLM vulnerability scanner from NVIDIA with 100+ attack probes that test models for prompt injection, data leakage, jailbreaks, and other security weaknesses.
Giskard	★ 5.4k	An open-source library for testing and scanning LLM and ML models for issues like hallucination, bias, and toxicity, including multi-turn agent testing and a vulnerability scanner.
