AI/TLDR

Portkey AI Gateway

Route to hundreds of LLMs through one fast API, with retries, fallbacks, and guardrails

Overview

Portkey AI Gateway is an open-source service that sits between your application and language model providers. You send requests to the gateway using one consistent API, and it forwards them to whichever provider you choose (OpenAI, Anthropic, Bedrock, Groq, and many others), so you can switch models without rewriting your client code.

It is built for teams running AI in production. On top of routing, the gateway adds reliability features like automatic retries and fallbacks, plus routing rules and guardrails you configure per request. The README reports that it is lightweight (around 122 kB installed) and adds under 1 ms of latency.

As an LLM observability and gateway tool, it gives you a single place to watch your traffic. The gateway ships with a local console that shows your logs, and it integrates with the OpenAI SDKs, LangChain, LlamaIndex, and agent frameworks so it slots into existing stacks.

What it does

  • One API that routes to 250+ language, vision, audio, and image models across many providers
  • Automatic retries and fallbacks to prevent downtime in production traffic
  • Load balancing and conditional routing to scale AI workloads
  • Guardrails you attach per request, such as denying responses that contain specified words
  • A built-in Gateway Console for viewing local request logs in one place
  • Runs locally via npx, or deploys to Docker, Node.js, Cloudflare Workers, and Replit
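The retry and fallback behavior listed above can be pictured with a small, provider-free sketch. This is not the gateway's implementation; it only illustrates the idea of retrying one target a few times before moving to the next. The function names and the `attempts_per_target` parameter are hypothetical, and the two "providers" are plain Python stand-ins.

```python
from typing import Callable, Sequence


class AllTargetsFailed(Exception):
    """Raised when every target has used up its attempts."""


def call_with_retries_and_fallbacks(
    targets: Sequence[Callable[[str], str]],
    prompt: str,
    attempts_per_target: int = 3,
) -> str:
    """Try each target in order, retrying a failing target before falling back."""
    errors = []
    for target in targets:
        for attempt in range(1, attempts_per_target + 1):
            try:
                return target(prompt)
            except Exception as exc:
                # Simplification: a real gateway would likely retry only on
                # selected errors or status codes, not on every exception.
                errors.append(f"{target.__name__} attempt {attempt}: {exc}")
    raise AllTargetsFailed("; ".join(errors))


# Hypothetical stand-ins for two provider calls.
def flaky_primary(prompt: str) -> str:
    raise TimeoutError("primary provider timed out")


def healthy_backup(prompt: str) -> str:
    return f"backup answered: {prompt}"


result = call_with_retries_and_fallbacks(
    [flaky_primary, healthy_backup], "ping", attempts_per_target=2
)
print(result)  # backup answered: ping
```

The caller only sees the final answer or a single combined error, which is the property that lets application code stay unchanged while the gateway handles provider failures.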

Getting started

Run the gateway locally with npx, then send your first request through it using the Portkey Python client. You need Node.js and npm installed.

Start the gateway locally

Run the gateway with npx. It starts on http://localhost:8787/v1, with the console at http://localhost:8787/public/.

bash
# Run the gateway locally (needs Node.js and npm)
npx @portkey-ai/gateway

Install the Python client

Install the portkey-ai package to call the gateway from Python.

bash
pip install -qU portkey-ai

Make your first request

Create a client, point it at a provider, pass that provider's API key, and send a chat completion.

python
from portkey_ai import Portkey

# OpenAI compatible client
client = Portkey(
    provider="openai",  # or 'anthropic', 'bedrock', 'groq', etc
    Authorization="sk-***"  # the provider API key
)

# Make a request through your AI Gateway
client.chat.completions.create(
    messages=[{"role": "user", "content": "What's the weather like?"}],
    model="gpt-4o-mini"
)
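Because the gateway speaks an OpenAI-compatible API, the same request can in principle be built without the Portkey client. The sketch below uses only the Python standard library and builds the request without sending it. It rests on two assumptions that should be checked against the official repo: that the chat endpoint lives at `/chat/completions` under the base URL given earlier, and that the provider is selected with an `x-portkey-provider` header. The helper name `build_chat_request` is hypothetical.

```python
import json
import urllib.request

# Assumed endpoint: the local base URL from above plus the OpenAI-style path.
GATEWAY_URL = "http://localhost:8787/v1/chat/completions"


def build_chat_request(
    provider: str, api_key: str, model: str, prompt: str
) -> urllib.request.Request:
    """Build (but do not send) a chat completion request for the gateway."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumed header name for choosing the provider.
            "x-portkey-provider": provider,
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_chat_request(
    "openai", "sk-***", "gpt-4o-mini", "What's the weather like?"
)

# To send it, start the gateway, use a real key, and uncomment:
# with urllib.request.urlopen(request) as resp:
#     reply = json.load(resp)
#     print(reply["choices"][0]["message"]["content"])
```

Keeping request construction separate from sending makes it easy to inspect the headers and body before any traffic reaches a provider.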

Add routing and guardrails

Attach a config to the client to set retry rules and guardrails. This example retries up to 5 times and denies any reply containing the word Apple.

python
config = {
  "retry": {"attempts": 5},
  "output_guardrails": [{
    "default.contains": {"operator": "none", "words": ["Apple"]},
    "deny": True
  }]
}

# Attach the config to the client
client = client.with_options(config=config)

client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Reply randomly with Apple or Bat"}]
)
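To make the guardrail in that config easier to reason about, here is a rough local model of how a "contains" check with the `none` operator could be evaluated. It is a sketch under stated assumptions, not the gateway's code: matching is assumed to be case-sensitive substring matching, the `any` and `all` operators are assumed to exist alongside `none`, and the helper names `contains_check` and `should_deny` are hypothetical.

```python
from typing import Sequence


def contains_check(text: str, words: Sequence[str], operator: str = "none") -> bool:
    """Return True when the check passes (assumed case-sensitive substring match)."""
    hits = [word in text for word in words]
    if operator == "none":
        return not any(hits)  # passes only if no listed word appears
    if operator == "any":
        return any(hits)      # assumed operator
    if operator == "all":
        return all(hits)      # assumed operator
    raise ValueError(f"unknown operator: {operator}")


def should_deny(reply: str, guardrail: dict) -> bool:
    """Deny the reply when the guardrail sets deny and its check fails."""
    rule = guardrail["default.contains"]
    passed = contains_check(reply, rule["words"], rule["operator"])
    return guardrail.get("deny", False) and not passed


# Same shape as the guardrail entry in the config above.
guardrail = {
    "default.contains": {"operator": "none", "words": ["Apple"]},
    "deny": True,
}

print(should_deny("Apple", guardrail))  # True
print(should_deny("Bat", guardrail))    # False
```

Under this reading, a reply of "Apple" fails the `none` check and is denied, while "Bat" passes through.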

Commands and code are distilled from the project's own documentation; always check the official repo for the latest.

When to use it

  • Swap between LLM providers (OpenAI, Anthropic, Bedrock, Groq) without changing your application code
  • Keep an AI feature running through provider outages using automatic retries and fallbacks
  • Apply guardrails to model output, such as blocking responses that contain certain words
  • Inspect and debug your LLM request logs from a single local console

How Portkey AI Gateway compares

Portkey AI Gateway alongside the other open-source gateway and routing tools that AI/TLDR tracks, ranked by GitHub stars.

Tool | Stars | What it does
LiteLLM | ★ 50.9k | A Python SDK and proxy server that gives one OpenAI-compatible API to 100+ LLM providers, with cost tracking, budgets, fallbacks, rate limiting, and an admin UI.
Apache APISIX | ★ 16.8k | A cloud-native API gateway whose AI plugins add multi-provider LLM proxying, load balancing, retries and fallbacks, token-based rate limiting, and content moderation.
Portkey AI Gateway | ★ 12.1k | Route to hundreds of LLMs through one fast API, with retries, fallbacks, and guardrails
Higress | ★ 8.7k | An AI-native API gateway built on Istio and Envoy that proxies and governs traffic to many LLM providers, with token rate limiting, caching, and MCP server hosting.
Plano (formerly Arch Gateway) | ★ 6.6k | An Envoy-based proxy and data plane for agentic apps that handles prompt routing between agents, guardrails, unified access to LLMs, and observability.
Bifrost | ★ 5.9k | A high-throughput LLM gateway written in Go that gives a single OpenAI-compatible API to many providers, with failover, load balancing, semantic caching, and very low overhead at high request rates.
RouteLLM | ★ 5k | A framework from LMSYS for serving and evaluating LLM routers that sends easy queries to cheaper models and hard ones to stronger models to cut cost.
vLLM Semantic Router | ★ 4.5k | An intelligent router that inspects each request and sends it to the most suitable model in a mixture-of-models setup across cloud, data center, and edge.