AI/TLDR

Higress

An AI-native API gateway on Istio and Envoy for LLM traffic and MCP servers

Overview

Higress is a cloud-native API gateway built on Istio and Envoy. It can be extended with Wasm plugins written in Go, Rust, or JavaScript, and ships with dozens of ready-to-use plugins plus a built-in web console. It started inside Alibaba to handle long-connection services and gRPC/Dubbo load balancing, and now also acts as an AI gateway.

Its AI gateway features sit in front of LLM APIs from mainstream model providers, both international and domestic. Teams use it to route, proxy, and govern traffic to those models from one place, applying authentication, rate limiting, and observability without changing each backend.

Higress also hosts MCP (Model Context Protocol) servers through the same plugin mechanism, so AI agents can call tools and services through the gateway. With the companion openapi-to-mcp tool you can turn an OpenAPI spec into a remote MCP server. It fits the LLM gateway / proxy category by giving both LLM APIs and MCP APIs unified management.

What it does

  • Cloud-native API gateway based on Istio and Envoy, extensible with Wasm plugins in Go, Rust, or JavaScript
  • AI proxy support for mainstream LLM providers, both international and domestic, behind one endpoint
  • Hosts MCP servers via its plugin mechanism so AI agents can call external tools and services
  • openapi-to-mcp tool converts OpenAPI specs into remote MCP servers for hosting
  • Unified authentication, fine-grained rate limiting, audit logs, and observability for LLM and MCP traffic
  • Out-of-the-box web console and dozens of general-purpose plugins; runs from a single Docker image or via Helm on Kubernetes

Getting started

Higress can be started with just Docker, which is convenient for local learning or simple setups. For Kubernetes, install it with Helm.

Run Higress with Docker

Create a working directory and start the all-in-one image. Configuration files are written to the working directory.

bashbash
mkdir higress; cd higress
docker run -d --rm --name higress-ai -v ${PWD}:/data \
        -p 8001:8001 -p 8080:8080 -p 8443:8443  \
        higress-registry.cn-hangzhou.cr.aliyuncs.com/higress/all-in-one:latest

Open the console and gateway ports

Port 8001 serves the Higress UI console, port 8080 is the HTTP gateway entry, and port 8443 is the HTTPS gateway entry.

Install on Kubernetes with Helm (optional)

For K8s deployments, install with Helm. The global.hub parameter lets you pick a mirror registry closer to your region.

bashbash
helm install higress -n higress-system higress.io/higress \
  --set global.hub=higress-registry.us-west-1.cr.aliyuncs.com --create-namespace

Commands and code are distilled from the project's own documentation — always check the official repo for the latest.

When to use it

  • Put a single gateway in front of multiple LLM providers and apply rate limiting, auth, and audit logs to AI traffic
  • Host MCP servers so AI agents can reach tools and services through one managed endpoint
  • Convert existing OpenAPI specs into remote MCP servers with the openapi-to-mcp tool
  • Run a cloud-native API gateway for general HTTP/gRPC/Dubbo services, extended with Wasm plugins

How Higress compares

Higress alongside other open-source gateways & routing tools AI/TLDR tracks, ranked by GitHub stars.

ToolStarsWhat it does
LiteLLM★ 50.9kA Python SDK and proxy server that gives one OpenAI-compatible API to 100+ LLM providers, with cost tracking, budgets, fallbacks, rate limiting, and an admin UI.
Apache APISIX★ 16.8kA cloud-native API gateway whose AI plugins add multi-provider LLM proxying, load balancing, retries and fallbacks, token-based rate limiting, and content moderation.
Portkey AI Gateway★ 12.1kAn LLM gateway that routes calls to 100+ providers through one API and adds logging, tracing, caching, and fallbacks for production AI traffic.
Higress★ 8.7kAn AI-native API gateway on Istio and Envoy for LLM traffic and MCP servers
Plano (formerly Arch Gateway)★ 6.6kAn Envoy-based proxy and data plane for agentic apps that handles prompt routing between agents, guardrails, unified access to LLMs, and observability.
Bifrost★ 5.9kA high-throughput LLM gateway written in Go that gives a single OpenAI-compatible API to many providers, with failover, load balancing, semantic caching, and very low overhead at high request rates.
RouteLLM★ 5kA framework from LMSYS for serving and evaluating LLM routers that sends easy queries to cheaper models and hard ones to stronger models to cut cost.
vLLM Semantic Router★ 4.5kAn intelligent router that inspects each request and sends it to the most suitable model in a mixture-of-models setup across cloud, data center, and edge.