PentAGI

Autonomous AI agent that runs penetration tests in an isolated Docker sandbox

github.com/vxcontrol/pentagi · ★ 17.8k · pentagi.com

Overview

PentAGI (Penetration testing Artificial General Intelligence) is an open-source tool for automated security testing. It is built for security professionals, researchers, and enthusiasts who want a powerful and flexible way to run penetration tests with the help of AI.

The platform runs as a self-hosted stack and is fully autonomous: an AI-driven agent decides which steps to take and then carries them out. Every operation runs inside a sandboxed Docker environment for isolation, and a team of specialized agents handles research, development, and infrastructure tasks. PentAGI works with 10+ LLM providers, including OpenAI, Anthropic, Google Gemini, AWS Bedrock, and local models through Ollama.

What it does

Fully autonomous AI agent that determines and executes penetration testing steps with optional execution monitoring and task planning
All operations run in a sandboxed, isolated Docker environment for safe execution
Built-in suite of 20+ professional security tools, including nmap, metasploit, and sqlmap
Team of specialized agents for research, development, and infrastructure tasks, with smart memory and a Graphiti knowledge graph for context
Works with 10+ LLM providers (OpenAI, Anthropic, Gemini, AWS Bedrock, Ollama, and more) plus aggregators like OpenRouter and DeepInfra
Detailed vulnerability reports with exploitation guides, plus REST and GraphQL APIs with Bearer token authentication

Getting started

PentAGI is deployed with Docker Compose. You need Docker and Docker Compose, at least 2 vCPU, 4 GB of RAM, and 20 GB of free disk space. The project recommends its interactive installer; the steps below cover the manual setup.
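The minimums above can be checked before installing anything. The following is a minimal sketch, not part of PentAGI: the function names meets_minimum and preflight are made up for this example, and it assumes a Linux host with nproc, /proc/meminfo, and df available.

```bash
#!/usr/bin/env bash
# Pre-flight sketch for the minimums quoted above
# (2 vCPU, 4 GB RAM, 20 GB free disk). Hypothetical helper, Linux only.

# meets_minimum LABEL ACTUAL REQUIRED: print a verdict, return 0 on pass.
meets_minimum() {
  local label="$1" actual="$2" required="$3"
  if [ "$actual" -ge "$required" ]; then
    echo "ok: $label $actual >= $required"
  else
    echo "FAIL: $label $actual < $required"
    return 1
  fi
}

# preflight: read values from the host and compare them to the minimums.
preflight() {
  local cpus mem_gb disk_gb status=0
  command -v docker >/dev/null 2>&1 || { echo "FAIL: docker not found"; status=1; }
  docker compose version >/dev/null 2>&1 || { echo "FAIL: docker compose not found"; status=1; }
  cpus=$(nproc)
  # MemTotal is in kB; round to the nearest GB because a "4 GB" machine
  # usually reports slightly less than 4 GiB.
  mem_gb=$(awk '/^MemTotal:/ {print int($2 / 1024 / 1024 + 0.5)}' /proc/meminfo)
  # Column 4 of POSIX-format df output is available space in 1K blocks.
  disk_gb=$(df -Pk . | awk 'NR==2 {print int($4 / 1024 / 1024)}')
  meets_minimum "vCPU" "$cpus" 2 || status=1
  meets_minimum "RAM (GB)" "$mem_gb" 4 || status=1
  meets_minimum "free disk (GB)" "$disk_gb" 20 || status=1
  return "$status"
}
```

Running preflight from the directory where the stack will live prints one line per check and returns non-zero if any check fails.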

Create a working directory

Make a folder for the PentAGI stack and move into it.

bash

mkdir pentagi && cd pentagi

Download and fill in the environment file

Download the example .env file, then set at least one LLM provider key (such as OPEN_AI_KEY, ANTHROPIC_API_KEY, or GEMINI_API_KEY) and update the security-related variables.

bash

curl -o .env https://raw.githubusercontent.com/vxcontrol/pentagi/master/.env.example
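Editing .env by hand works fine. For scripted setups, a small helper can set a key without opening an editor. This is a sketch under assumptions: set_env_var is a hypothetical function written for this example, the key value is a placeholder, and the file is assumed to use plain KEY=value lines.

```bash
#!/usr/bin/env bash
# set_env_var FILE KEY VALUE
# Replace the KEY=... line in FILE in place, or append it if KEY is absent.
# Hypothetical helper; not shipped with PentAGI.
set_env_var() {
  local file="$1" key="$2" value="$3" tmp
  [ -f "$file" ] || { echo "no such file: $file" >&2; return 1; }
  tmp=$(mktemp) || return 1
  # Pass key and value through the environment so awk does not
  # interpret backslashes in them.
  K="$key" V="$value" awk -F= '
    $1 == ENVIRON["K"] { print ENVIRON["K"] "=" ENVIRON["V"]; found = 1; next }
    { print }
    END { if (!found) print ENVIRON["K"] "=" ENVIRON["V"] }
  ' "$file" > "$tmp" && mv "$tmp" "$file"
}

# Example (placeholder value, replace with a real key):
#   set_env_var .env OPEN_AI_KEY "your-key-here"
```

Commented-out lines such as "# OPEN_AI_KEY=" are left untouched, so the helper appends a fresh assignment in that case.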

Run the PentAGI stack

Download the docker-compose file and start all services in the background.

bash

curl -O https://raw.githubusercontent.com/vxcontrol/pentagi/master/docker-compose.yml
docker compose up -d

Open the web UI

Open https://localhost:8443 in a browser to reach the PentAGI web interface. The default login is admin@pentagi.com / admin; change the password immediately after the first sign-in.
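The services may take a little while to come up after docker compose up -d. A polling loop like the one below can confirm that the UI answers before you open a browser. It is a sketch, not an official health check: wait_for and probe_with_curl are names invented here, and the -k flag assumes the local instance serves a self-signed certificate.

```bash
#!/usr/bin/env bash
# wait_for URL [ATTEMPTS] [DELAY_SECONDS]
# Poll URL until it answers at all, or give up after ATTEMPTS tries.
# PROBE can be overridden so the loop can be exercised without a network.
PROBE=${PROBE:-probe_with_curl}

probe_with_curl() {
  # -k skips certificate verification (assumed self-signed local cert),
  # -s hides progress, -o discards the body, --max-time bounds each try.
  curl -k -s -o /dev/null --max-time 5 "$1"
}

wait_for() {
  local url="$1" attempts="${2:-30}" delay="${3:-2}" i=1
  while [ "$i" -le "$attempts" ]; do
    if "$PROBE" "$url"; then
      echo "up after $i attempt(s): $url"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "still unreachable after $attempts attempt(s): $url" >&2
  return 1
}

# Example:
#   wait_for https://localhost:8443
```

If the wait times out, docker compose ps and docker compose logs show which service failed to start.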

Commands and code are distilled from the project's own documentation; always check the official repo for the latest version.

When to use it

Run autonomous penetration tests against a target system, letting the AI agent plan and execute the steps in an isolated sandbox
Generate detailed vulnerability reports with exploitation guides for security assessments
Automate and integrate security testing into other systems through the REST and GraphQL APIs
Self-host a private, controlled AI pentesting platform that keeps all data and execution on your own infrastructure
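For the API integration case, requests carry a Bearer token in the Authorization header. The wrapper below sketches that pattern with curl. Everything PentAGI-specific in it is an assumption: the PENTAGI_TOKEN variable name and the api_get function are invented for this example, and no endpoint path is given because the real paths must come from the project's API documentation.

```bash
#!/usr/bin/env bash
# api_get URL
# Send a GET request with a Bearer token read from PENTAGI_TOKEN
# (a variable name chosen for this sketch, not defined by PentAGI).
# CURL can be overridden to inspect the arguments without sending anything.
CURL=${CURL:-curl}

api_get() {
  local url="$1"
  if [ -z "${PENTAGI_TOKEN:-}" ]; then
    echo "PENTAGI_TOKEN is not set" >&2
    return 1
  fi
  # -f makes HTTP 4xx/5xx responses exit non-zero, -sS stays quiet but
  # still reports errors, -k assumes a self-signed local certificate.
  "$CURL" -fsS -k \
    -H "Authorization: Bearer ${PENTAGI_TOKEN}" \
    -H "Accept: application/json" \
    "$url"
}

# Example (fill in the endpoint from the API docs):
#   PENTAGI_TOKEN="your-token" api_get "https://localhost:8443/<endpoint>"
```

Keeping the token in an environment variable rather than on the command line keeps it out of shell history, though it is still visible to processes running as the same user.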

How PentAGI compares

PentAGI alongside the other open-source multi-agent tools that AI/TLDR tracks, ranked by GitHub stars.

Tool	Stars	What it does
MetaGPT	★ 68.9k	A multi-agent framework that models a software company, assigning roles like product manager, architect, and engineer to generate code from a single prompt.
AutoGen	★ 59.1k	Microsoft Research's framework for building applications where multiple agents converse with each other and with tools to solve tasks.
CrewAI	★ 54k	A framework for assembling teams ('crews') of role-playing agents that divide tasks and collaborate to complete a goal.
OpenAI Agents SDK	★ 27.3k	OpenAI's lightweight Python SDK for building multi-agent workflows using explicit handoffs, tools, and guardrails.
AgentScope	★ 27k	A framework for building multi-agent applications with message passing, visual debugging tools, and distributed execution.
OpenAI Swarm	★ 21.7k	An educational, lightweight framework from OpenAI for experimenting with multi-agent coordination through handoffs and routines.
PentAGI	★ 17.8k	Autonomous AI agent that runs penetration tests in an isolated Docker sandbox.
CAMEL	★ 17.2k	A research-oriented framework for studying and building communicating agents that cooperate through role-playing conversations.
