Bifrost

One OpenAI-compatible API for 23+ LLM providers, with failover and caching

github.com/maximhq/bifrost | ★ 5.9k | getmaxim.ai/bifrost

Overview

Bifrost is an open-source AI gateway written in Go that puts a single, OpenAI-compatible API in front of 23+ model providers, including OpenAI, Anthropic, AWS Bedrock, Google Vertex, Azure, Mistral, Ollama, and Groq. Your application talks to one endpoint, and Bifrost routes each request to the provider named in the model string, written as provider/model (for example, openai/gpt-4o-mini).

It is aimed at teams that call several LLM providers and want one place to handle keys, failover, and routing instead of wiring up each SDK separately. Because the API matches the OpenAI format, it can act as a near drop-in replacement in existing OpenAI or Anthropic client code.

As an LLM gateway, Bifrost sits between your app and the providers. It adds automatic fallbacks, load balancing across keys and providers, semantic caching, and observability, and ships with a built-in web UI for visual configuration and monitoring.

What it does

Single OpenAI-compatible API in front of 23+ providers (OpenAI, Anthropic, AWS Bedrock, Google Vertex, Azure, Cohere, Mistral, Ollama, Groq, and more)
Automatic fallbacks between providers and models, plus load balancing across multiple API keys
Semantic caching that reuses responses for similar requests to cut cost and latency
Built-in web UI for provider configuration, real-time monitoring, and analytics
Model Context Protocol (MCP) support so models can call external tools like filesystem and web search
Observability with native Prometheus metrics, distributed tracing, and request logging
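
To make the fallback idea in the list above concrete, here is a minimal shell sketch of ordered failover: try each provider/model candidate in turn and stop at the first one that succeeds. It illustrates the concept only and is not Bifrost's implementation. The names try_model, with_fallbacks, and DOWN_MODELS, and the model name anthropic/backup-model, are hypothetical, and the upstream call is simulated.

```shell
# Hypothetical stand-in for one upstream call: it succeeds unless the
# model is listed in DOWN_MODELS. A real gateway would call the provider here.
try_model() {
  case " $DOWN_MODELS " in
    *" $1 "*) return 1 ;;
  esac
  printf 'response from %s\n' "$1"
}

# Try each provider/model candidate in order; stop at the first success.
with_fallbacks() {
  for candidate in "$@"; do
    if try_model "$candidate"; then
      return 0
    fi
  done
  echo "all candidates failed" >&2
  return 1
}

# Simulate an outage of the primary model; the second candidate answers.
DOWN_MODELS="openai/gpt-4o-mini"
with_fallbacks "openai/gpt-4o-mini" "anthropic/backup-model"
# prints: response from anthropic/backup-model
```

The point of doing this in a gateway rather than in each application is that the retry order lives in one place and the calling code stays unchanged.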

Getting started

Bifrost runs as an HTTP gateway you can start with npx or Docker, then call with any OpenAI-compatible client.

Start the gateway

Run Bifrost locally with npx, or use the Docker image. Both serve the gateway and the web UI on port 8080.

# Install and run locally
npx -y @maximhq/bifrost

# Or use Docker
docker run -p 8080:8080 maximhq/bifrost
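
When the gateway is started from a script (for example in CI), it can help to wait until port 8080 answers before sending requests. The following is a minimal sketch, assuming the root URL returns a successful HTTP status once the web UI is up. wait_for_gateway and the PROBE variable are hypothetical names, not Bifrost features.

```shell
# Hypothetical helper: poll a URL until it answers or the attempts run out.
# PROBE can be overridden (for example in tests); by default it is a quiet
# curl request that fails on HTTP error statuses.
PROBE="${PROBE:-curl -sf -o /dev/null}"

wait_for_gateway() {
  url=$1
  attempts=${2:-30}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # $PROBE is intentionally unquoted so the command and its flags split.
    if $PROBE "$url"; then
      echo "gateway is up at $url"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "gateway did not respond at $url" >&2
  return 1
}
```

Calling wait_for_gateway http://localhost:8080 polls once per second for up to 30 attempts; a second argument changes the attempt limit.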

Configure via the web UI

Open the built-in web interface to add providers and API keys.

# Open the built-in web interface (macOS; on Linux use xdg-open, or paste the URL into a browser)
open http://localhost:8080

Make your first API call

Send a chat completion to the gateway. Set the model as provider/model, for example openai/gpt-4o-mini.

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o-mini",
    "messages": [{"role": "user", "content": "Hello, Bifrost!"}]
  }'
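
For repeated calls from a script, the curl request above can be wrapped in a small helper. This is a sketch and not part of Bifrost: the function names json_escape, build_payload, and bifrost_chat and the BIFROST_URL variable are made up for illustration. Only the endpoint path and the provider/model format come from the steps above.

```shell
# Base URL of the gateway; defaults to the local address used above.
BIFROST_URL="${BIFROST_URL:-http://localhost:8080}"

# Escape backslashes and double quotes so a value is safe inside a JSON string.
# Newlines and control characters are not handled in this sketch.
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Build a chat-completions request body from a provider/model string and a prompt.
build_payload() {
  printf '{"model": "%s", "messages": [{"role": "user", "content": "%s"}]}' \
    "$(json_escape "$1")" "$(json_escape "$2")"
}

# Send the request to the gateway's chat completions endpoint.
bifrost_chat() {
  curl -sS -X POST "$BIFROST_URL/v1/chat/completions" \
    -H "Content-Type: application/json" \
    -d "$(build_payload "$1" "$2")"
}
```

With the gateway running, bifrost_chat "openai/gpt-4o-mini" "Hello, Bifrost!" sends a request equivalent to the curl command above. Switching providers then only means changing the first argument.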

Commands and code are distilled from the project's own documentation; always check the official repo for the latest version.

When to use it

Call several LLM providers through one OpenAI-compatible endpoint instead of integrating each SDK separately
Add automatic failover so requests keep working when one provider or model is down
Reduce cost and latency by serving repeated or similar prompts from the semantic cache
Track usage and route traffic across multiple API keys with budgets and rate limits
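
The "similar prompts" idea behind a semantic cache comes down to a similarity score and a threshold. The sketch below is an illustration under loud assumptions: production semantic caches typically compare embedding vectors, whereas this stand-in uses Jaccard overlap of word sets so that it runs with plain awk. The function names and the 0.8 threshold are hypothetical and are not taken from Bifrost.

```shell
# Jaccard similarity between the word sets of two prompts, printed as 0.00 to 1.00.
# Stand-in for embedding similarity; prompts containing backslashes are not handled.
similarity() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    na = split(tolower(a), wa, /[^a-z0-9]+/)
    nb = split(tolower(b), wb, /[^a-z0-9]+/)
    for (i = 1; i <= na; i++) if (wa[i] != "") seta[wa[i]] = 1
    for (i = 1; i <= nb; i++) if (wb[i] != "") setb[wb[i]] = 1
    for (w in seta) { union++; if (w in setb) inter++ }
    for (w in setb) if (!(w in seta)) union++
    printf "%.2f", (union ? inter / union : 0)
  }'
}

# Decide whether a new prompt is close enough to a cached one to reuse its response.
# Third argument is the threshold (hypothetical default: 0.8).
is_cache_hit() {
  awk -v s="$(similarity "$1" "$2")" -v t="${3:-0.8}" \
    'BEGIN { exit !(s + 0 >= t + 0) }'
}

if is_cache_hit "What is the capital of France?" "what is the capital of france"; then
  echo "cache hit"
fi
# prints: cache hit
```

The threshold is the design choice that matters: set it too low and unrelated prompts share answers, set it too high and the cache only ever matches exact repeats.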

How Bifrost compares

Bifrost alongside the other open-source gateways and routing tools that AI/TLDR tracks, ranked by GitHub stars.

Tool	Stars	What it does
LiteLLM	★ 50.9k	A Python SDK and proxy server that gives one OpenAI-compatible API to 100+ LLM providers, with cost tracking, budgets, fallbacks, rate limiting, and an admin UI.
Apache APISIX	★ 16.8k	A cloud-native API gateway whose AI plugins add multi-provider LLM proxying, load balancing, retries and fallbacks, token-based rate limiting, and content moderation.
Portkey AI Gateway	★ 12.1k	An LLM gateway that routes calls to 100+ providers through one API and adds logging, tracing, caching, and fallbacks for production AI traffic.
Higress	★ 8.7k	An AI-native API gateway built on Istio and Envoy that proxies and governs traffic to many LLM providers, with token rate limiting, caching, and MCP server hosting.
Plano (formerly Arch Gateway)	★ 6.6k	An Envoy-based proxy and data plane for agentic apps that handles prompt routing between agents, guardrails, unified access to LLMs, and observability.
Bifrost	★ 5.9k	One OpenAI-compatible API for 23+ LLM providers, with failover and caching
RouteLLM	★ 5k	A framework from LMSYS for serving and evaluating LLM routers that sends easy queries to cheaper models and hard ones to stronger models to cut cost.
vLLM Semantic Router	★ 4.5k	An intelligent router that inspects each request and sends it to the most suitable model in a mixture-of-models setup across cloud, data center, and edge.
