LiteLLM

Call 100+ LLM providers through one OpenAI-compatible API

github.com/BerriAI/litellm | ★ 50.9k | litellm.ai

Overview

LiteLLM is an open-source AI gateway that gives you a single, unified interface to call 100+ LLM providers, including OpenAI, Anthropic, Gemini, Bedrock, and Azure, all using the OpenAI request format. Instead of juggling a different SDK, auth pattern, and error type for every model, you write your code once and switch providers by changing the model name.

You can use it two ways. As a Python SDK, you import the `completion` function and call any model directly from your application. As an AI Gateway (proxy server), you deploy it as a central service that your whole team points at, with virtual keys, spend tracking, and load balancing handled in one place.

As an LLM gateway, LiteLLM sits between your apps and the model providers. It fits teams that want to standardize how they reach models, keep client code OpenAI-compatible, and add cost tracking, budgets, and fallbacks without rewriting each integration.

What it does

One unified API for 100+ LLMs, so you avoid provider-specific SDKs
Drop-in OpenAI compatibility — swap providers by changing the model string, not your code
Proxy server (AI Gateway) with virtual keys, spend tracking, guardrails, and load balancing
Admin dashboard for managing keys and monitoring usage out of the box
Supports many endpoint types: chat/completions, responses, embeddings, images, audio, batches, and rerank
Can invoke A2A agents (LangGraph, Vertex AI Agent Engine, Bedrock AgentCore, Pydantic AI) via SDK or gateway

Getting started

LiteLLM works as a Python SDK for direct calls or as a proxy server for your whole team. Pick one to start.

Install the Python SDK

Add LiteLLM to your project.

bash

uv add litellm

Make your first call

Set the provider API keys you need, then call any model with the OpenAI-format `completion` function. Switch providers by changing the model string.

python

from litellm import completion
import os

os.environ["OPENAI_API_KEY"] = "your-openai-key"
os.environ["ANTHROPIC_API_KEY"] = "your-anthropic-key"

# OpenAI
response = completion(model="openai/gpt-4o", messages=[{"role": "user", "content": "Hello!"}])

# Anthropic
response = completion(model="anthropic/claude-sonnet-4-20250514", messages=[{"role": "user", "content": "Hello!"}])
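
Because the response follows the OpenAI format, the reply text should be readable at `response.choices[0].message.content`. The sketch below wraps the call in a small helper so one prompt can be sent to several models. `build_messages`, `ask`, and `MODELS` are hypothetical names introduced here for illustration, and `ask` assumes the API keys above are set to valid values.

```python
def build_messages(prompt: str) -> list[dict[str, str]]:
    """Wrap a prompt in the OpenAI-style chat message list."""
    return [{"role": "user", "content": prompt}]


def ask(model: str, prompt: str) -> str:
    """Send one prompt to `model` and return the reply text."""
    # Imported lazily so build_messages can be used without litellm installed.
    from litellm import completion

    response = completion(model=model, messages=build_messages(prompt))
    # Responses follow the OpenAI shape: choices -> message -> content.
    return response.choices[0].message.content


# Model strings taken from the example above; only this value changes per provider.
MODELS = ["openai/gpt-4o", "anthropic/claude-sonnet-4-20250514"]
```

Calling `ask(model, "Hello!")` for each entry in `MODELS` would then send the same prompt to both providers without any other code change.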

Or run the AI Gateway (proxy server)

Install LiteLLM with the proxy extra, then start it with the name of the model to serve. The gateway exposes an OpenAI-compatible endpoint on port 4000.

bash

uv tool install 'litellm[proxy]'
litellm --model gpt-4o

Call the gateway with the OpenAI client

Point any OpenAI client at the local proxy base URL to route requests through LiteLLM.

python

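import openai

# Point the client at the local proxy instead of api.openai.com.
# The api_key here is a placeholder value, as in the original example.
client = openai.OpenAI(api_key="anything", base_url="http://0.0.0.0:4000")

# The request itself is a standard OpenAI chat completion call.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)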
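
Since the gateway speaks the OpenAI request format over HTTP, any HTTP client should be able to reach it, not only the OpenAI SDK. The sketch below builds the equivalent request with the Python standard library. It assumes the proxy exposes the usual OpenAI-style `/chat/completions` path, and `build_chat_request` and `send` are hypothetical helper names. Nothing goes over the network until `send` is called.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-format chat request for the gateway."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",  # assumed OpenAI-style path
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> str:
    """Send the request and return the reply text from the first choice."""
    with urllib.request.urlopen(request) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the proxy running, `send(build_chat_request("http://0.0.0.0:4000", "anything", "gpt-4o", "Hello!"))` would mirror the client call above.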

Commands and code are distilled from the project's own documentation — always check the official repo for the latest.

When to use it

Build an app that can switch between OpenAI, Anthropic, and Gemini without rewriting client code
Run a central gateway so a team shares one endpoint with virtual keys and per-team spend tracking
Add fallbacks and load balancing across providers to keep requests flowing when one model is down
Standardize calls to many endpoint types (chat, embeddings, images, audio, rerank) behind one OpenAI-format API
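
The fallback idea from the list above can be illustrated in a few lines of plain Python. This is a conceptual sketch only, not LiteLLM's implementation or API: `call_with_fallbacks`, `flaky_call`, and the model names are hypothetical, and a real deployment would rely on the gateway's own routing configuration.

```python
from typing import Callable, Sequence


def call_with_fallbacks(models: Sequence[str], call: Callable[[str], str]) -> tuple[str, str]:
    """Try each model in order; return (model, reply) from the first that succeeds."""
    errors: list[str] = []
    for model in models:
        try:
            return model, call(model)
        except Exception as exc:  # a real client would catch narrower error types
            errors.append(f"{model}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))


def flaky_call(model: str) -> str:
    """Stand-in for a provider call: the first model is treated as down."""
    if model == "primary-model":
        raise ConnectionError("provider unavailable")
    return f"reply from {model}"


used, reply = call_with_fallbacks(["primary-model", "backup-model"], flaky_call)
print(used, reply)
```

The caller sees a single successful reply from the backup model, and only gets an error when every model in the list has failed.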

How LiteLLM compares

LiteLLM alongside the other open-source gateway and routing tools that AI/TLDR tracks, ranked by GitHub stars.

Tool	Stars	What it does
LiteLLM	★ 50.9k	Call 100+ LLM providers through one OpenAI-compatible API
Apache APISIX	★ 16.8k	A cloud-native API gateway whose AI plugins add multi-provider LLM proxying, load balancing, retries and fallbacks, token-based rate limiting, and content moderation.
Portkey AI Gateway	★ 12.1k	An LLM gateway that routes calls to 100+ providers through one API and adds logging, tracing, caching, and fallbacks for production AI traffic.
Higress	★ 8.7k	An AI-native API gateway built on Istio and Envoy that proxies and governs traffic to many LLM providers, with token rate limiting, caching, and MCP server hosting.
Plano (formerly Arch Gateway)	★ 6.6k	An Envoy-based proxy and data plane for agentic apps that handles prompt routing between agents, guardrails, unified access to LLMs, and observability.
Bifrost	★ 5.9k	A high-throughput LLM gateway written in Go that gives a single OpenAI-compatible API to many providers, with failover, load balancing, semantic caching, and very low overhead at high request rates.
RouteLLM	★ 5k	A framework from LMSYS for serving and evaluating LLM routers that sends easy queries to cheaper models and hard ones to stronger models to cut cost.
vLLM Semantic Router	★ 4.5k	An intelligent router that inspects each request and sends it to the most suitable model in a mixture-of-models setup across cloud, data center, and edge.
