Overview
Apache APISIX is a dynamic, real-time API gateway that handles traffic management tasks such as load balancing, dynamic upstreams, canary releases, circuit breaking, authentication, and observability. It runs anywhere from bare metal to Kubernetes and can also act as a Kubernetes ingress controller.
Through its plugin system, APISIX works as an AI gateway: it proxies requests to LLM providers, load balances across them, applies retries and fallbacks, and enforces token-based rate limiting and security controls. An mcp-bridge plugin can also expose stdio-based MCP servers as HTTP SSE services.
It fits the LLM gateway / proxy category for teams that already need a general-purpose gateway and want one place to route both traditional API traffic and AI model traffic, without being locked into a single vendor.
What it does
- AI proxying and load balancing across LLM upstreams, with retries and fallbacks
- Token-based rate limiting plus authentication and security plugins for AI traffic
- `mcp-bridge` plugin converts stdio-based MCP servers into scalable HTTP SSE services
- Hot updates of configuration and plugins without restarts
- Multi-protocol support: HTTP(S), gRPC, TCP/UDP, MQTT, Dubbo, WebSocket, and HTTP/3 with QUIC
- Fine-grained routing with full-path and prefix matching, plus health checks and circuit breaking
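The difference between full-path and prefix matching can be sketched with two routes. This is a minimal sketch, assuming APISIX is already running with the Admin API on port 9180 as in the Getting started section below. The helper names `route_body` and `put_route` and the route ids are hypothetical. A trailing `/*` in `uri` makes the route match by prefix, while a plain path matches only that exact path.

```shell
#!/bin/sh
# Assumed default: Admin API address used by the quickstart setup.
ADMIN_URL="${ADMIN_URL:-http://127.0.0.1:9180}"

# Print a route definition as JSON.
#   $1 = route id (hypothetical name), $2 = uri pattern
route_body() {
  cat <<EOF
{
  "id": "$1",
  "uri": "$2",
  "upstream": {
    "type": "roundrobin",
    "nodes": { "httpbin.org:80": 1 }
  }
}
EOF
}

# Send the route definition to the Admin API (reads the JSON from stdin).
put_route() {
  route_body "$1" "$2" | curl -s -i "$ADMIN_URL/apisix/admin/routes" -X PUT -d @-
}
```

Calling `put_route exact-ip "/ip"` would create a route that matches only `/ip`, while `put_route prefix-anything "/anything/*"` would match every path under `/anything/`. Deployments other than the quickstart may also require an `X-API-KEY` header on Admin API calls.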
Getting started
Run APISIX locally with the official quickstart script, then create and test a route through the Admin API.
Start APISIX
The quickstart script starts two Docker containers, apisix-quickstart and etcd. Docker is required.
curl -sL https://run.api7.ai/apisix/quickstart | sh
Verify it is running
Check that APISIX answers on its proxy port 9080.
curl "http://127.0.0.1:9080" --head | grep Server
Create a route
Use the Admin API on port 9180 to forward /ip to an upstream.
curl -i "http://127.0.0.1:9180/apisix/admin/routes" -X PUT -d '{
  "id": "getting-started-ip",
  "uri": "/ip",
  "upstream": {
    "type": "roundrobin",
    "nodes": { "httpbin.org:80": 1 }
  }
}'
Test the route
Send a request through the gateway on port 9080 to confirm the route works.
curl "http://127.0.0.1:9080/ip"
Commands and code are distilled from the project's own documentation; always check the official repo for the latest.
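After testing, the route can be inspected or removed through the same Admin API. The following is a small sketch rather than part of the official quickstart: it assumes the Admin API is still on port 9180 and that the route id is `getting-started-ip` as created above. The helper names `route_url`, `show_route`, and `delete_route` are hypothetical.

```shell
#!/bin/sh
# Assumed defaults taken from the steps above.
ADMIN_URL="${ADMIN_URL:-http://127.0.0.1:9180}"
ROUTE_ID="getting-started-ip"

# Print the Admin API URL for a given route id.
route_url() {
  printf '%s/apisix/admin/routes/%s\n' "$ADMIN_URL" "$1"
}

# Fetch the stored definition of a route.
show_route() {
  curl -s "$(route_url "$1")"
}

# Delete a route once it is no longer needed.
delete_route() {
  curl -s -X DELETE "$(route_url "$1")"
}
```

Running `show_route "$ROUTE_ID"` should return the stored route definition, and `delete_route "$ROUTE_ID"` removes it, after which requests to `/ip` on port 9080 are expected to stop matching.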
When to use it
- Put a single gateway in front of multiple LLM providers, with load balancing, retries, and fallbacks between them
- Enforce token-based rate limits, authentication, and access control on AI model traffic
- Expose stdio-based MCP servers as HTTP SSE services using the mcp-bridge plugin
- Run one gateway for both traditional north-south API traffic and AI traffic, including as a Kubernetes ingress controller
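As an illustration of the first two use cases, the sketch below builds a route that proxies chat requests to one LLM provider and caps token usage. It rests on several assumptions: the plugin names `ai-proxy` and `ai-rate-limiting` and their fields (`provider`, `auth`, `options`, `limit`, `time_window`, `limit_strategy`) reflect my understanding of recent APISIX releases and may differ in the version you run. The route id, URI, model name, limits, and helper names are all hypothetical.

```shell
#!/bin/sh
# Assumed default: Admin API address used by the quickstart setup.
ADMIN_URL="${ADMIN_URL:-http://127.0.0.1:9180}"
# Placeholder only; export a real provider key before applying the route.
OPENAI_API_KEY="${OPENAI_API_KEY:-sk-placeholder}"

# Print a route definition that proxies to an LLM provider and
# limits token usage. Field names are assumptions; check the plugin docs.
ai_route_body() {
  cat <<EOF
{
  "id": "ai-chat",
  "uri": "/v1/chat",
  "methods": ["POST"],
  "plugins": {
    "ai-proxy": {
      "provider": "openai",
      "auth": { "header": { "Authorization": "Bearer $OPENAI_API_KEY" } },
      "options": { "model": "gpt-4" }
    },
    "ai-rate-limiting": {
      "limit": 300,
      "time_window": 30,
      "limit_strategy": "total_tokens"
    }
  }
}
EOF
}

# Send the route definition to the Admin API (reads the JSON from stdin).
apply_ai_route() {
  ai_route_body | curl -s -i "$ADMIN_URL/apisix/admin/routes" -X PUT -d @-
}
```

Running `ai_route_body` prints the JSON for review without contacting the gateway, and `apply_ai_route` submits it. Keeping the provider key in the gateway configuration means client applications would not need to hold it themselves.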
How Apache APISIX compares
Apache APISIX alongside the other open-source gateway and routing tools that AI/TLDR tracks, ranked by GitHub stars.
| Tool | Stars | What it does |
|---|---|---|
| LiteLLM | ★ 50.9k | A Python SDK and proxy server that gives one OpenAI-compatible API to 100+ LLM providers, with cost tracking, budgets, fallbacks, rate limiting, and an admin UI. |
| Apache APISIX | ★ 16.8k | Cloud-native API gateway with AI plugins for proxying and governing LLM traffic |
| Portkey AI Gateway | ★ 12.1k | An LLM gateway that routes calls to 100+ providers through one API and adds logging, tracing, caching, and fallbacks for production AI traffic. |
| Higress | ★ 8.7k | An AI-native API gateway built on Istio and Envoy that proxies and governs traffic to many LLM providers, with token rate limiting, caching, and MCP server hosting. |
| Plano (formerly Arch Gateway) | ★ 6.6k | An Envoy-based proxy and data plane for agentic apps that handles prompt routing between agents, guardrails, unified access to LLMs, and observability. |
| Bifrost | ★ 5.9k | A high-throughput LLM gateway written in Go that gives a single OpenAI-compatible API to many providers, with failover, load balancing, semantic caching, and very low overhead at high request rates. |
| RouteLLM | ★ 5k | A framework from LMSYS for serving and evaluating LLM routers that sends easy queries to cheaper models and hard ones to stronger models to cut cost. |
| vLLM Semantic Router | ★ 4.5k | An intelligent router that inspects each request and sends it to the most suitable model in a mixture-of-models setup across cloud, data center, and edge. |