Overview
Higress is a cloud-native API gateway built on Istio and Envoy. It can be extended with Wasm plugins written in Go, Rust, or JavaScript, and ships with dozens of ready-to-use plugins plus a built-in web console. It started inside Alibaba to handle long-connection services and gRPC/Dubbo load balancing, and now also acts as an AI gateway.
Its AI gateway features sit in front of LLM APIs from mainstream model providers, both international and domestic. Teams use it to route, proxy, and govern traffic to those models from one place, applying authentication, rate limiting, and observability without changing each backend.
Higress also hosts MCP (Model Context Protocol) servers through the same plugin mechanism, so AI agents can call tools and services through the gateway. With the companion openapi-to-mcp tool you can turn an OpenAPI spec into a remote MCP server. It fits the LLM gateway / proxy category by giving both LLM APIs and MCP APIs unified management.
What it does
- Cloud-native API gateway based on Istio and Envoy, extensible with Wasm plugins in Go, Rust, or JavaScript
- AI proxy support for mainstream LLM providers, both international and domestic, behind one endpoint
- Hosts MCP servers via its plugin mechanism so AI agents can call external tools and services
- openapi-to-mcp tool converts OpenAPI specs into remote MCP servers for hosting
- Unified authentication, fine-grained rate limiting, audit logs, and observability for LLM and MCP traffic
- Out-of-the-box web console and dozens of general-purpose plugins; runs from a single Docker image or via Helm on Kubernetes
Getting started
Higress can be started with just Docker, which is convenient for local learning or simple setups. For Kubernetes, install it with Helm.
Run Higress with Docker
Create a working directory and start the all-in-one image. Configuration files are written to the working directory.
mkdir higress; cd higress
docker run -d --rm --name higress-ai -v ${PWD}:/data \
-p 8001:8001 -p 8080:8080 -p 8443:8443 \
higress-registry.cn-hangzhou.cr.aliyuncs.com/higress/all-in-one:latestOpen the console and gateway ports
Port 8001 serves the Higress UI console, port 8080 is the HTTP gateway entry, and port 8443 is the HTTPS gateway entry.
Install on Kubernetes with Helm (optional)
For K8s deployments, install with Helm. The global.hub parameter lets you pick a mirror registry closer to your region.
helm install higress -n higress-system higress.io/higress \
--set global.hub=higress-registry.us-west-1.cr.aliyuncs.com --create-namespaceCommands and code are distilled from the project's own documentation — always check the official repo for the latest.
When to use it
- Put a single gateway in front of multiple LLM providers and apply rate limiting, auth, and audit logs to AI traffic
- Host MCP servers so AI agents can reach tools and services through one managed endpoint
- Convert existing OpenAPI specs into remote MCP servers with the openapi-to-mcp tool
- Run a cloud-native API gateway for general HTTP/gRPC/Dubbo services, extended with Wasm plugins
How Higress compares
Higress alongside other open-source gateways & routing tools AI/TLDR tracks, ranked by GitHub stars.
| Tool | Stars | What it does |
|---|---|---|
| LiteLLM | ★ 50.9k | A Python SDK and proxy server that gives one OpenAI-compatible API to 100+ LLM providers, with cost tracking, budgets, fallbacks, rate limiting, and an admin UI. |
| Apache APISIX | ★ 16.8k | A cloud-native API gateway whose AI plugins add multi-provider LLM proxying, load balancing, retries and fallbacks, token-based rate limiting, and content moderation. |
| Portkey AI Gateway | ★ 12.1k | An LLM gateway that routes calls to 100+ providers through one API and adds logging, tracing, caching, and fallbacks for production AI traffic. |
| Higress | ★ 8.7k | An AI-native API gateway on Istio and Envoy for LLM traffic and MCP servers |
| Plano (formerly Arch Gateway) | ★ 6.6k | An Envoy-based proxy and data plane for agentic apps that handles prompt routing between agents, guardrails, unified access to LLMs, and observability. |
| Bifrost | ★ 5.9k | A high-throughput LLM gateway written in Go that gives a single OpenAI-compatible API to many providers, with failover, load balancing, semantic caching, and very low overhead at high request rates. |
| RouteLLM | ★ 5k | A framework from LMSYS for serving and evaluating LLM routers that sends easy queries to cheaper models and hard ones to stronger models to cut cost. |
| vLLM Semantic Router | ★ 4.5k | An intelligent router that inspects each request and sends it to the most suitable model in a mixture-of-models setup across cloud, data center, and edge. |