Overview
PentAGI (Penetration testing Artificial General Intelligence) is an open-source tool for automated security testing. It is built for security professionals, researchers, and enthusiasts who want a powerful and flexible way to run penetration tests with the help of AI.
The platform runs as a self-hosted stack and is fully autonomous: an AI-driven agent decides which steps to take and then carries them out. Every operation runs inside a sandboxed Docker environment for isolation, and a team of specialized agents handles research, development, and infrastructure tasks. PentAGI works with 10+ LLM providers, including OpenAI, Anthropic, Google Gemini, AWS Bedrock, and local models through Ollama.
What it does
- Fully autonomous AI agent that determines and executes penetration testing steps with optional execution monitoring and task planning
- All operations run in a sandboxed, isolated Docker environment for safe execution
- Built-in suite of 20+ professional security tools, including nmap, metasploit, and sqlmap
- Team of specialized agents for research, development, and infrastructure tasks, with smart memory and a Graphiti knowledge graph for context
- Works with 10+ LLM providers (OpenAI, Anthropic, Gemini, AWS Bedrock, Ollama, and more) plus aggregators like OpenRouter and DeepInfra
- Detailed vulnerability reports with exploitation guides, plus REST and GraphQL APIs with Bearer token authentication
Getting started
PentAGI is deployed with Docker Compose. You need Docker and Docker Compose, at least 2 vCPU, 4GB of RAM, and 20GB of free disk space. The project recommends its interactive installer; the steps here cover the manual setup.
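The requirements above can be checked before installing. The following is a minimal sketch, not part of PentAGI: the `meets_minimum` and `preflight` helpers are hypothetical names, and the script assumes a Linux host (it relies on `nproc` and `/proc/meminfo`). It checks free space on the current directory's filesystem, which may not be where Docker stores its data.

```shell
#!/bin/sh
# Minimums quoted in the text: 2 vCPU, 4GB RAM, 20GB free disk.
MIN_CPUS=2
MIN_RAM_GB=4
MIN_DISK_GB=20

# meets_minimum ACTUAL REQUIRED: succeeds when ACTUAL >= REQUIRED (integers).
meets_minimum() {
  [ "$1" -ge "$2" ]
}

# preflight: hypothetical helper; prints one line per failed check
# and returns non-zero if any check failed.
preflight() {
  status=0
  command -v docker >/dev/null 2>&1 || { echo "docker not found" >&2; status=1; }
  docker compose version >/dev/null 2>&1 || { echo "docker compose not available" >&2; status=1; }

  cpus=$(nproc)
  # MemTotal is in kB; round to the nearest GB, since a "4GB" host reports slightly less.
  ram_gb=$(awk '/^MemTotal:/ {print int(($2 + 524288) / 1048576)}' /proc/meminfo)
  # Free space (in GB) on the filesystem that holds the current directory.
  disk_gb=$(df -Pk . | awk 'NR == 2 {print int($4 / 1048576)}')

  meets_minimum "$cpus" "$MIN_CPUS" || { echo "need $MIN_CPUS vCPU, found $cpus" >&2; status=1; }
  meets_minimum "$ram_gb" "$MIN_RAM_GB" || { echo "need ${MIN_RAM_GB}GB RAM, found ${ram_gb}GB" >&2; status=1; }
  meets_minimum "$disk_gb" "$MIN_DISK_GB" || { echo "need ${MIN_DISK_GB}GB free disk, found ${disk_gb}GB" >&2; status=1; }
  return "$status"
}
```

Running `preflight` from the directory where you plan to install prints nothing and returns 0 when every check passes.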
Create a working directory
Make a folder for the PentAGI stack and move into it.
mkdir pentagi && cd pentagi
Download and fill in the environment file
Copy the example .env file, then add at least one LLM provider key (such as OPEN_AI_KEY, ANTHROPIC_API_KEY, or GEMINI_API_KEY) and update the security-related variables.
curl -o .env https://raw.githubusercontent.com/vxcontrol/pentagi/master/.env.example
Run the PentAGI stack
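Before starting the stack, it can help to confirm that .env really contains a provider key. This is a rough sketch under stated assumptions: `has_provider_key` is a hypothetical helper, and it only looks for the three variable names mentioned above, so extend the pattern if you use a different provider.

```shell
#!/bin/sh
# has_provider_key FILE: succeeds if at least one of the listed keys
# has a non-empty value. Empty values, empty quotes (KEY=""), and
# values that start with a comment marker are treated as unset.
has_provider_key() {
  grep -Eq "^(OPEN_AI_KEY|ANTHROPIC_API_KEY|GEMINI_API_KEY)=[\"']?[^\"'[:space:]#]" "$1"
}

# Example use (commented out so the snippet has no side effects):
# has_provider_key .env || echo "Add at least one LLM provider key to .env" >&2
```

The check is purely textual; it does not verify that the key is valid for the provider.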
Download the docker-compose file and start all services in the background.
curl -O https://raw.githubusercontent.com/vxcontrol/pentagi/master/docker-compose.yml
docker compose up -d
Open the web UI
Visit https://localhost:8443 to reach the PentAGI web interface. The default login is admin@pentagi.com / admin (change it right away).
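Because `docker compose up -d` returns before the services finish starting, the UI may not respond immediately. One way to wait for it is a small retry loop. The `retry` function below is a generic, hypothetical helper rather than anything PentAGI ships, and the `-k` flag in the example assumes the local endpoint uses a certificate your system does not trust.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted.
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    # Sleep between attempts, but not after the final one.
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Example use (commented out so the snippet has no side effects):
# retry 30 2 curl -ksf -o /dev/null https://localhost:8443 && echo "UI is up"
```

If the loop gives up, `docker compose ps` and `docker compose logs` are the usual places to look next.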
Commands and code are distilled from the project's own documentation; always check the official repo for the latest version.
When to use it
- Run autonomous penetration tests against a target system, letting the AI agent plan and execute the steps in an isolated sandbox
- Generate detailed vulnerability reports with exploitation guides for security assessments
- Automate and integrate security testing into other systems through the REST and GraphQL APIs
- Self-host a private, controlled AI pentesting platform that keeps all data and execution on your own infrastructure
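For the API integration case, requests carry a Bearer token, as noted in the feature list. The sketch below only illustrates the header shape. `PENTAGI_URL`, `PENTAGI_API_TOKEN`, and both helper names are assumptions made for this example, no endpoint path is given here because this page does not list any, and `--insecure` assumes a local certificate that is not trusted by your system.

```shell
#!/bin/sh
# bearer_header TOKEN: prints the Authorization header for a Bearer token.
bearer_header() {
  printf 'Authorization: Bearer %s' "$1"
}

# pentagi_get PATH: hypothetical wrapper for an authenticated GET request.
# PENTAGI_URL and PENTAGI_API_TOKEN are placeholder variable names.
pentagi_get() {
  curl --silent --show-error --fail --insecure \
    -H "$(bearer_header "$PENTAGI_API_TOKEN")" \
    -H 'Accept: application/json' \
    "${PENTAGI_URL:-https://localhost:8443}$1"
}

# Example use, with a path taken from the official API documentation:
# PENTAGI_API_TOKEN=... pentagi_get /path/from/the/api/docs
```

Keeping the token in an environment variable rather than on the command line avoids leaving it in shell history.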
How PentAGI compares
PentAGI alongside the other open-source multi-agent system tools that AI/TLDR tracks, ranked by GitHub stars.
| Tool | Stars | What it does |
|---|---|---|
| MetaGPT | ★ 68.9k | A multi-agent framework that models a software company, assigning roles like product manager, architect, and engineer to generate code from a single prompt. |
| AutoGen | ★ 59.1k | Microsoft Research's framework for building applications where multiple agents converse with each other and with tools to solve tasks. |
| CrewAI | ★ 54k | A framework for assembling teams ('crews') of role-playing agents that divide tasks and collaborate to complete a goal. |
| OpenAI Agents SDK | ★ 27.3k | OpenAI's lightweight Python SDK for building multi-agent workflows using explicit handoffs, tools, and guardrails. |
| AgentScope | ★ 27k | A framework for building multi-agent applications with message passing, visual debugging tools, and distributed execution. |
| OpenAI Swarm | ★ 21.7k | An educational, lightweight framework from OpenAI for experimenting with multi-agent coordination through handoffs and routines. |
| PentAGI | ★ 17.8k | An autonomous AI agent that runs penetration tests in an isolated Docker sandbox. |
| CAMEL | ★ 17.2k | A research-oriented framework for studying and building communicating agents that cooperate through role-playing conversations. |