Plano (formerly Arch Gateway)

An Envoy-based proxy and data plane for agentic apps

Overview

Plano (formerly Arch Gateway) is an out-of-process proxy server and data plane for agentic applications. Built on Envoy by its core contributors, it moves the repetitive plumbing of production agents - routing between agents, guardrail and moderation hooks, LLM access, and observability - out of your framework and into a separate data plane you configure with YAML.

It is aimed at teams who can build an agent demo quickly but struggle to ship it safely and repeatably. Instead of writing intent classifiers, model fallbacks, provider adapters, and tracing glue in every codebase, you declare your agents and model providers once and let Plano handle the wiring. Your agents stay as plain HTTP services in any language or framework.

As an LLM gateway and proxy, Plano sits between your services and the models. It can route by model name, by alias, or automatically by preference, capture OTEL traces and metrics with no extra code, and apply moderation and memory policies through filter chains.

What it does

Low-latency orchestration between agents - add new agents without changing application code
Smart LLM routing: route by model name, semantic alias, or automatically by preference
Zero-code capture of agentic signals plus OpenTelemetry traces and metrics across every agent
Guardrail filter chains for jailbreak protection, moderation policies, and memory consistency
Unified, OpenAI-compatible access to multiple LLM providers (OpenAI, Anthropic, and more)
Built on Envoy as a separate out-of-process data plane, so it works with any language or framework

Getting started

Plano runs as a separate proxy configured by a YAML file. You declare your agents and model providers, run your agents as plain HTTP services, then start Plano and query it.

Install Plano and set up your environment

Follow the prerequisites and quickstart guide in the docs to install the Plano CLI and configure access. The README points to docs.planoai.dev for the exact install steps.

Define your agents in YAML

Declare agent URLs, model providers, and an agent listener. Plano handles routing, fallbacks, and tracing from this config.

yamlyaml

# config.yaml
version: v0.3.0

agents:
  - id: weather_agent
    url: http://localhost:10510
  - id: flight_agent
    url: http://localhost:10520

model_providers:
  - model: openai/gpt-4o
    access_key: $OPENAI_API_KEY
    default: true
  - model: anthropic/claude-3-5-sonnet
    access_key: $ANTHROPIC_API_KEY

listeners:
  - type: agent
    name: travel_assistant
    port: 8001
    router: plano_orchestrator_v1
    agents:
      - id: weather_agent
        description: |
          Gets real-time weather and forecasts for any city worldwide.
      - id: flight_agent
        description: |
          Searches flights between airports with live status and schedules.

tracing:
  random_sampling: 100

Start Plano and query your agents

Start the proxy with your config file, then send an OpenAI-compatible chat completion request. Plano routes the request to the right agent.

bashbash

# Start Plano
planoai up config.yaml

# Query - Plano routes to the right agent
curl http://localhost:8001/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o"}'

Commands and code are distilled from the project's own documentation — always check the official repo for the latest.

When to use it

Routing a single conversation across multiple specialized agents (for example a weather agent and a flight agent) without hard-coding routing logic
Giving services unified, OpenAI-compatible access to several LLM providers with automatic model fallback
Adding jailbreak protection, moderation, and memory policies to an agentic app through filter chains instead of bespoke code
Capturing traces, metrics, and agentic signals across all agents for evaluation and continuous improvement

How Plano (formerly Arch Gateway) compares

Plano (formerly Arch Gateway) alongside other open-source gateways & routing tools AI/TLDR tracks, ranked by GitHub stars.

Tool	Stars	What it does
LiteLLM	★ 50.9k	A Python SDK and proxy server that gives one OpenAI-compatible API to 100+ LLM providers, with cost tracking, budgets, fallbacks, rate limiting, and an admin UI.
Apache APISIX	★ 16.8k	A cloud-native API gateway whose AI plugins add multi-provider LLM proxying, load balancing, retries and fallbacks, token-based rate limiting, and content moderation.
Portkey AI Gateway	★ 12.1k	An LLM gateway that routes calls to 100+ providers through one API and adds logging, tracing, caching, and fallbacks for production AI traffic.
Higress	★ 8.7k	An AI-native API gateway built on Istio and Envoy that proxies and governs traffic to many LLM providers, with token rate limiting, caching, and MCP server hosting.
Plano (formerly Arch Gateway)	★ 6.6k	An Envoy-based proxy and data plane for agentic apps
Bifrost	★ 5.9k	A high-throughput LLM gateway written in Go that gives a single OpenAI-compatible API to many providers, with failover, load balancing, semantic caching, and very low overhead at high request rates.
RouteLLM	★ 5k	A framework from LMSYS for serving and evaluating LLM routers that sends easy queries to cheaper models and hard ones to stronger models to cut cost.
vLLM Semantic Router	★ 4.5k	An intelligent router that inspects each request and sends it to the most suitable model in a mixture-of-models setup across cloud, data center, and edge.

// Overview

// What it does

// Getting started