Katana

A fast Go crawler that maps every URL, endpoint, and JS file on a target site

github.com/projectdiscovery/katana (★ 17.1k)
docs.projectdiscovery.io/opensource/katana/overview

Overview

Katana is a command-line crawling and spidering framework written in Go, built by ProjectDiscovery. You point it at a URL or a list of URLs, and it walks the site to discover every reachable URL, endpoint, and JavaScript file.

It is aimed at security testers, bug bounty hunters, and engineers who need to map a site's attack surface or build automation pipelines. Katana runs in a standard HTTP mode or a headless (real browser) mode, and can parse JavaScript files to find endpoints that a plain crawler would miss.

Within the web scraping and crawling category, Katana focuses on speed and scriptability. It reads input from STDIN, a single URL, or a file list, and writes output to STDOUT, a file, or JSON, so it fits cleanly into shell pipelines alongside other tools.
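The file-in, JSON-out path described above might look like the sketch below. The helper name `crawl_list` and both file arguments are hypothetical. The `-list`, `-j`, and `-o` switches are flag names as recalled from the project README, so treat them as assumptions and confirm them with `katana -h` on your installed version.

```bash
# Hypothetical helper: crawl every URL listed in a file and save the
# results as JSON Lines. Flag names are assumptions; verify with `katana -h`.
# Usage: crawl_list <url-file> <output-file>
crawl_list() {
  list_file="${1:-}"
  out_file="${2:-}"

  if [ -z "$list_file" ] || [ -z "$out_file" ]; then
    echo "usage: crawl_list <url-file> <output-file>" >&2
    return 2
  fi

  if [ ! -r "$list_file" ]; then
    echo "cannot read URL list: $list_file" >&2
    return 1
  fi

  # -list  read target URLs from a file, one per line
  # -j     emit JSON Lines instead of plain URLs
  # -o     write the results to a file
  katana -list "$list_file" -j -o "$out_file"
}
```

For example, `crawl_list targets.txt results.jsonl` would crawl each URL in `targets.txt`. A single target can also be piped in on STDIN, as in `echo https://tesla.com | katana`.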

What it does

Standard and headless (real browser) crawling modes
JavaScript parsing and crawling to surface hidden endpoints
Scope control via preconfigured fields or custom regex
Customizable automatic form filling (experimental)
Flexible input (STDIN, URL, list) and output (STDOUT, file, JSON)
Configurable crawl depth, duration, and per-domain page limits

Getting started

Katana requires Go 1.26+ to install from source, or you can pull a prebuilt Docker image. Once installed, you can crawl a target with a single command.

Install with Go

Install the latest Katana binary using the Go toolchain. CGO must be enabled.

CGO_ENABLED=1 go install github.com/projectdiscovery/katana/cmd/katana@latest

Or pull the Docker image

If you prefer not to build from source, pull the official image.

docker pull projectdiscovery/katana:latest

Crawl a target

Pass a URL with -u to start a standard crawl. Add -headless for browser-based crawling.

katana -u https://tesla.com

Check the available flags

Run the help command to see all supported switches, including depth, JS crawling, and scope control.

katana -h
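As a rough sketch of how the depth and JS crawling switches combine, the wrapper below bounds the crawl depth and turns on JavaScript parsing. The function name `crawl_deep` and its default depth are illustrative only. The `-d`, `-jc`, and `-silent` flags are as recalled from the project README, so check them against the help output before relying on them.

```bash
# Hypothetical helper: crawl one URL with a bounded depth and JS parsing on.
# Flag names are assumptions; verify with `katana -h`.
# Usage: crawl_deep <url> [depth]
crawl_deep() {
  url="${1:-}"
  depth="${2:-3}"   # illustrative default, not a recommendation

  if [ -z "$url" ]; then
    echo "usage: crawl_deep <url> [depth]" >&2
    return 2
  fi

  # -d       maximum crawl depth
  # -jc      parse JavaScript files for additional endpoints
  # -silent  print only the discovered URLs
  katana -u "$url" -d "$depth" -jc -silent
}
```

Calling `crawl_deep https://tesla.com 2` would limit the crawl to two levels of links from the start page.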

Commands and code are distilled from the project's own documentation. Always check the official repo for the latest.

When to use it

Mapping a web application's URLs and endpoints during a security assessment or bug bounty
Extracting endpoints from JavaScript files that a basic crawler would miss
Feeding discovered URLs into automation pipelines via STDIN/STDOUT and JSON output
Crawling single-page apps and JS-heavy sites in headless mode with a real browser
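The pipeline use case can be sketched as a small filter that sits downstream of Katana. The function below, `only_js` (a hypothetical name), reads URLs on STDIN, keeps those that end in `.js` (optionally followed by a query string or fragment), and removes duplicates. It relies only on standard `grep` and `sort`, not on any Katana feature.

```bash
# Hypothetical filter: keep unique URLs that end in .js, allowing a
# trailing query string (?...) or fragment (#...).
only_js() {
  grep -E '\.js([?#].*)?$' | sort -u
}
```

Assuming `-silent` limits Katana's output to discovered URLs, a pipeline such as `katana -u https://tesla.com -silent | only_js` would list the JavaScript files found during the crawl.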

How Katana compares

Katana alongside other open-source web scraping & crawling tools AI/TLDR tracks, ranked by GitHub stars.

Tool	Stars	What it does
Firecrawl	★ 135k	A crawling service and API that converts whole websites into clean Markdown or structured JSON ready for LLMs.
Crawl4AI	★ 68.9k	A local-first Python web crawler that turns pages into clean Markdown for use in RAG and LLM pipelines.
Scrapling	★ 65k	A Python web scraping framework whose parser relocates your elements when pages change, with stealthy fetchers and a Scrapy-like spider engine for full crawls.
Scrapy	★ 62.3k	A mature Python framework for writing fast spiders that crawl websites and extract structured data at scale.
ScrapeGraphAI	★ 27.4k	A Python library that uses LLMs and a graph pipeline to extract data from pages based on natural-language prompts.
Colly	★ 25.3k	A Go scraping framework for building fast crawlers with request handling, callbacks, and rate limiting.
Crawlee	★ 23.8k	A Node.js/TypeScript scraping library with proxy rotation and browser fingerprinting for building reliable crawlers.
Katana	★ 17.1k	A fast Go crawler that maps every URL, endpoint, and JS file on a target site.
