AI/TLDR

Apache APISIX

Cloud-native API gateway with AI plugins for proxying and governing LLM traffic

Overview

Apache APISIX is a dynamic, real-time API gateway that handles traffic management tasks such as load balancing, dynamic upstreams, canary releases, circuit breaking, authentication, and observability. It runs anywhere from bare-metal to Kubernetes and can also act as a Kubernetes ingress controller.

Through its plugin system, APISIX works as an AI gateway: it proxies requests to LLM providers, load balances across them, applies retries and fallbacks, and enforces token-based rate limiting and security controls. An mcp-bridge plugin can also expose stdio-based MCP servers as HTTP SSE services.

It fits the LLM gateway / proxy category for teams that already need a general-purpose gateway and want one place to route both traditional API traffic and AI model traffic, without locking into a single vendor.

What it does

  • AI proxying and load balancing across LLM upstreams, with retries and fallbacks
  • Token-based rate limiting plus authentication and security plugins for AI traffic
  • `mcp-bridge` plugin converts stdio-based MCP servers into scalable HTTP SSE services
  • Hot updates of configuration and plugins without restarts
  • Multi-protocol support: HTTP(S), gRPC, TCP/UDP, MQTT, Dubbo, WebSocket, and HTTP/3 with QUIC
  • Fine-grained routing with full-path and prefix matching, plus health checks and circuit breaking
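
As a rough illustration of the AI proxying item above, the sketch below builds a route body that enables the `ai-proxy` plugin and sends it to the Admin API. The field names (`provider`, `auth.header`, `options.model`) are an assumption based on the plugin's documentation for recent releases, and the schema has changed between versions, so check it against the version you run. The route ID `ai-proxy-sketch`, the `/anything` URI, both function names, and the `OPENAI_API_KEY` variable are hypothetical.

```shell
# Print a route definition that enables the ai-proxy plugin.
# Assumption: provider/auth/options field names match your APISIX version.
ai_proxy_route_json() {
  local api_key="$1"
  local model="${2:-gpt-4}"
  cat <<EOF
{
  "id": "ai-proxy-sketch",
  "uri": "/anything",
  "methods": ["POST"],
  "plugins": {
    "ai-proxy": {
      "provider": "openai",
      "auth": { "header": { "Authorization": "Bearer ${api_key}" } },
      "options": { "model": "${model}" }
    }
  }
}
EOF
}

# Send the definition to the Admin API (quickstart defaults, port 9180).
# curl reads the request body from stdin via "-d @-".
create_ai_proxy_route() {
  ai_proxy_route_json "${OPENAI_API_KEY:?set OPENAI_API_KEY first}" \
    | curl -i "http://127.0.0.1:9180/apisix/admin/routes" -X PUT -d @-
}
```

Calling `create_ai_proxy_route` registers the route. A chat-style JSON POST to `http://127.0.0.1:9080/anything` should then be forwarded to the provider with the configured credentials, if the plugin behaves as assumed.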

Getting started

Run APISIX locally with the official quickstart script, then create and test a route through the Admin API.

Start APISIX

The quickstart script starts two Docker containers, apisix-quickstart and etcd. Docker is required.

```bash
curl -sL https://run.api7.ai/apisix/quickstart | sh
```

Verify it is running

Check that APISIX answers on its proxy port 9080.

```bash
curl "http://127.0.0.1:9080" --head | grep Server
```
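
The containers can take a few seconds to come up, so a single check may fail if it runs too early. The sketch below polls the proxy port until it answers. The function name `wait_for_apisix` and its retry defaults are hypothetical; only the address and port come from the steps above.

```shell
# wait_for_apisix [URL] [ATTEMPTS] [DELAY_SECONDS]
# Polls URL until curl gets any HTTP response, or gives up.
wait_for_apisix() {
  local url="${1:-http://127.0.0.1:9080}"
  local attempts="${2:-30}"
  local delay="${3:-1}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # Any HTTP response (even 404) means the proxy is listening;
    # curl exits non-zero only when the connection itself fails.
    if curl -s -o /dev/null --max-time 2 --head "$url"; then
      echo "APISIX is answering at $url"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "APISIX did not answer at $url after $attempts attempts" >&2
  return 1
}
```

Running `wait_for_apisix` with no arguments retries for roughly 30 seconds against the default proxy address.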

Create a route

Use the Admin API on port 9180 to create a route that forwards requests for /ip to the upstream httpbin.org.

```bash
curl -i "http://127.0.0.1:9180/apisix/admin/routes" -X PUT -d '{
  "id": "getting-started-ip",
  "uri": "/ip",
  "upstream": {
    "type": "roundrobin",
    "nodes": { "httpbin.org:80": 1 }
  }
}'
```
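
To confirm the route was stored, it can be read back from the Admin API by ID. This is a minimal sketch assuming the quickstart defaults used above, where the Admin API on port 9180 is called without an API key. The helper names `admin_route_url` and `show_route` are hypothetical.

```shell
# Build the Admin API URL for a single route ID.
admin_route_url() {
  local route_id="$1"
  local admin_base="${2:-http://127.0.0.1:9180}"
  printf '%s/apisix/admin/routes/%s\n' "$admin_base" "$route_id"
}

# Fetch the stored route definition and print APISIX's JSON response.
show_route() {
  curl -s "$(admin_route_url "$1")"
}
```

For the route created above, the call would be `show_route getting-started-ip`.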

Test the route

Send a request through the gateway on port 9080 to confirm the route works.

```bash
curl "http://127.0.0.1:9080/ip"
```
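
If the route works, httpbin.org's /ip endpoint should return a small JSON object containing the caller's origin IP. For scripted checks, the sketch below looks only at the HTTP status code. The function name `check_route` is hypothetical, and treating 200 as the only success is an assumption for this example.

```shell
# check_route [URL]
# Succeeds only if the gateway returns HTTP 200 for URL.
check_route() {
  local url="${1:-http://127.0.0.1:9080/ip}"
  local status
  # -w '%{http_code}' prints just the status code; the body is discarded.
  status="$(curl -s -o /dev/null --max-time 10 -w '%{http_code}' "$url")"
  if [ "$status" = "200" ]; then
    echo "route OK (HTTP $status)"
    return 0
  fi
  echo "route check failed (HTTP ${status:-none})" >&2
  return 1
}
```

A 404 here usually means the route was not created. A 5xx status more likely points at the upstream being unreachable from the container.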

Commands and code are distilled from the project's own documentation — always check the official repo for the latest.

When to use it

  • Put a single gateway in front of multiple LLM providers, with load balancing, retries, and fallbacks between them
  • Enforce token-based rate limits, authentication, and access control on AI model traffic
  • Expose stdio-based MCP servers as HTTP SSE services using the mcp-bridge plugin
  • Run one gateway for both traditional north-south API traffic and AI traffic, including as a Kubernetes ingress controller
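
The multi-provider case in the first item might look like the sketch below, which builds a route body for the `ai-proxy-multi` plugin with two weighted instances. The `instances` layout (`name`, `provider`, `weight`, `auth`, `options`) is an assumption based on the plugin's documentation for recent releases and should be verified against your version. The route ID, instance names, model names, weights, and the function name are illustrative only.

```shell
# Print a route definition that splits LLM traffic across two providers.
# Assumption: the ai-proxy-multi "instances" schema matches your APISIX version.
ai_proxy_multi_route_json() {
  local primary_key="$1"
  local secondary_key="$2"
  cat <<EOF
{
  "id": "ai-proxy-multi-sketch",
  "uri": "/anything",
  "methods": ["POST"],
  "plugins": {
    "ai-proxy-multi": {
      "instances": [
        {
          "name": "openai-instance",
          "provider": "openai",
          "weight": 8,
          "auth": { "header": { "Authorization": "Bearer ${primary_key}" } },
          "options": { "model": "gpt-4" }
        },
        {
          "name": "deepseek-instance",
          "provider": "deepseek",
          "weight": 2,
          "auth": { "header": { "Authorization": "Bearer ${secondary_key}" } },
          "options": { "model": "deepseek-chat" }
        }
      ]
    }
  }
}
EOF
}
```

The output can be piped to the Admin API in the same way as the quickstart route, for example with `curl -i "http://127.0.0.1:9180/apisix/admin/routes" -X PUT -d @-`. With these weights, roughly 80% of requests would go to the first instance.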

How Apache APISIX compares

Apache APISIX alongside other open-source gateways & routing tools AI/TLDR tracks, ranked by GitHub stars.

| Tool | Stars | What it does |
| --- | --- | --- |
| LiteLLM | ★ 50.9k | A Python SDK and proxy server that gives one OpenAI-compatible API to 100+ LLM providers, with cost tracking, budgets, fallbacks, rate limiting, and an admin UI. |
| Apache APISIX | ★ 16.8k | Cloud-native API gateway with AI plugins for proxying and governing LLM traffic |
| Portkey AI Gateway | ★ 12.1k | An LLM gateway that routes calls to 100+ providers through one API and adds logging, tracing, caching, and fallbacks for production AI traffic. |
| Higress | ★ 8.7k | An AI-native API gateway built on Istio and Envoy that proxies and governs traffic to many LLM providers, with token rate limiting, caching, and MCP server hosting. |
| Plano (formerly Arch Gateway) | ★ 6.6k | An Envoy-based proxy and data plane for agentic apps that handles prompt routing between agents, guardrails, unified access to LLMs, and observability. |
| Bifrost | ★ 5.9k | A high-throughput LLM gateway written in Go that gives a single OpenAI-compatible API to many providers, with failover, load balancing, semantic caching, and very low overhead at high request rates. |
| RouteLLM | ★ 5k | A framework from LMSYS for serving and evaluating LLM routers that sends easy queries to cheaper models and hard ones to stronger models to cut cost. |
| vLLM Semantic Router | ★ 4.5k | An intelligent router that inspects each request and sends it to the most suitable model in a mixture-of-models setup across cloud, data center, and edge. |