In plain English
An API key is a secret password that proves your app is allowed to call a paid service — an LLM provider like OpenAI or Anthropic, an image model, a search API. Whoever holds the key can spend money on your account. That is the whole point to remember: a key is not a username, it is a credit card.

The classic beginner mistake is to paste that key straight into the code that runs in the user's browser — a React component, a <script> tag, a fetch call from the front end. It feels like it works: the app calls the model and answers appear. But everything the browser runs is fully visible to anyone who opens it. The key is right there in the page.
Think of a hotel. The front desk (your browser-facing app) talks to guests, but it never hands a guest the master key to the vault. The vault key lives in a back office (your server) that guests never enter. They ask the desk for something; the desk walks to the back, uses the key, and brings back only the result. In an AI app the rule is identical: the secret key stays on a server you control, and the browser only ever talks to your server, never to the model provider directly.
Why it matters
A leaked LLM key is not a small bug. Modern models are billed per token, and a stolen key has no spending conscience. Bots constantly scrape public sites and code repositories looking for keys; the gap between "I pushed my key" and "someone is running up charges on it" is often minutes, not days.
What actually goes wrong
- A surprise bill. An attacker fires thousands of expensive requests through your key. Some people have woken up to four- and five-figure invoices from a key that sat in a public repo overnight.
- Your app breaks for real users. Once the key hits its rate or spend limit, your own customers get errors — the abuse starves the service you were paying for.
- Data and reputation risk. A stolen key can be used to generate abusive content under your account, or to probe other resources the key happens to unlock.
- Quiet, slow drain. Not every theft is loud. A patient attacker can sip a little every day, and you only notice on the monthly statement.
Why is this so easy to get wrong? Because the insecure version is the path of least resistance. Tutorials show a single-file demo that calls the model from the browser. It runs on your laptop, looks finished, and you ship it. The security hole is invisible until the bill arrives. Knowing the backend-proxy pattern up front is what separates a toy demo from something you can safely put online — it is part of the modern AI app stack every builder should know.
How it works
The fix is a pattern called the backend proxy (or server-side proxy). Instead of the browser calling the model provider, the browser calls your own server, and your server — holding the secret key — calls the provider on its behalf. The key lives only where users cannot see it.
Compare the two architectures. The only structural change is inserting a server you control between the browser and the provider — but that one hop is what keeps the key private.
Insecure: the browser calls the provider directly
- Key shipped inside front-end code
- Anyone can read it in DevTools
- Scraped from your site or repo in minutes
- Attacker spends your money directly
- No way to add limits or auth
Secure: the browser calls your backend proxy
- Key lives only on the server
- Browser never sees it
- Server adds auth + rate limits
- You can log, cap, and rotate
- Standard production pattern
The request flow, step by step
The server reads the key from an environment variable — a value set outside your source code, in the runtime environment — so the secret is never written into a file you might commit and push. Here is a minimal proxy endpoint. Notice the browser sends only the prompt; the key is added server-side.

```typescript
import Anthropic from "@anthropic-ai/sdk";

// Key comes from the ENVIRONMENT, never hardcoded, never sent to the browser.
const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

// getUser and overLimit are app-specific helpers (your auth and
// rate-limit logic); they are not shown here.
export async function POST(req: Request) {
  // 1) Authenticate YOUR user (session cookie, token, etc.).
  const user = await getUser(req);
  if (!user) return new Response("Unauthorized", { status: 401 });

  // 2) Enforce a per-user limit BEFORE spending any money.
  if (await overLimit(user.id)) {
    return new Response("Rate limit exceeded", { status: 429 });
  }

  // 3) Take only the prompt from the browser, never the key.
  const { prompt } = await req.json();

  // 4) The secret key is attached here, on the server, out of sight.
  const msg = await client.messages.create({
    model: "claude-sonnet-4-6",
    max_tokens: 500,
    messages: [{ role: "user", content: prompt }],
  });

  // Content blocks are a union type; only text blocks carry .text.
  const first = msg.content[0];
  return Response.json({ text: first?.type === "text" ? first.text : "" });
}
```

And the matching call from the browser:

```typescript
// No API key anywhere in here. The browser calls your own endpoint.
const res = await fetch("/api/chat", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ prompt: userInput }),
});
const { text } = await res.json();
```

Storing the secret: env vars vs secret managers
Once the key lives only on the server, the next question is where to put it. Two levels are worth knowing, and you graduate from the first to the second as your app grows.
Environment variables (where everyone starts)
An environment variable is a setting your hosting platform injects when the code runs. Locally you keep it in a .env file; in production you paste it into your host's dashboard. The two iron rules:
- Never commit .env to git. Add it to .gitignore on day one. A key in your git history is leaked even if you delete it later, because old commits keep it.
- Set production secrets in the host's dashboard, not in any file you deploy. Vercel, Netlify, Render, Railway, Fly, and the cloud providers all have an "environment variables" or "secrets" panel.
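
On the server side, reading the variable can be sketched as a small fail-fast helper. This is an illustration, not part of any SDK: requireEnv is a hypothetical name, and the environment is passed in as a plain object so the function works the same in any runtime. The idea is that a missing key should stop the server at startup with a clear message, instead of surfacing later as a confusing provider error.

```typescript
// Hypothetical helper: read a required secret from the environment and
// fail immediately if it is missing or blank.
type Env = Record<string, string | undefined>;

function requireEnv(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    // The message names the variable but never prints a value.
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Typical use at server startup (Node-style runtime assumed):
//   const apiKey = requireEnv(process.env, "ANTHROPIC_API_KEY");
```

Calling it once at startup means a misconfigured deploy fails before it serves any traffic.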
A .gitignore that covers the first rule:

```gitignore
# Keep secrets out of version control
.env
.env.local
.env.*.local
```

Secret managers (when you scale)
A secret manager is a dedicated service that stores secrets encrypted and hands them to your app at runtime — examples are AWS Secrets Manager, Google Secret Manager, HashiCorp Vault, and Doppler. They add things plain env vars can't: access control over who and what can read a secret, an audit log of every access, and one-click rotation. For a solo project, env vars are fine; reach for a secret manager when you have a team, multiple services, or compliance needs.
| Approach | Good for | Watch out for |
|---|---|---|
| .env file (local only) | Development on your laptop | Must be git-ignored; never deploy it |
| Host dashboard env vars | Most small/medium apps | No audit log; shared across the project |
| Secret manager | Teams, many services, compliance | More setup; another service to run |
Rotation and per-user limits: damage control
Hiding the key prevents the obvious leak. The next mindset shift is assuming a leak can still happen and making sure it can't ruin you. Two habits do most of the work: rotating keys and capping spend.
Key rotation
Rotation means replacing a key with a fresh one and retiring the old one. You rotate on a schedule (say, every few months) and immediately whenever you suspect exposure — a key pushed to git, a laptop lost, a contractor leaving. The flow is simple: create the new key, update the env var / secret manager, deploy, confirm it works, then revoke the old key. Because the key lives in one place (your server config), rotating it is a config change, not a code change.
Per-user rate limits and spend caps
Even with a perfectly hidden key, a single abusive user of your app can run up a huge bill by hammering the expensive model. Because every request now flows through your server, you have the perfect place to add limits. Layer a few defenses:
- Require auth so requests are tied to a real user you can throttle or ban — anonymous endpoints are an open faucet.
- Rate-limit per user, e.g. N requests per minute and a daily cap, returning HTTP 429 when exceeded.
- Cap the cost per call with max_tokens and by limiting input length, so no single request can balloon.
- Set a hard spend limit at the provider (most dashboards offer a monthly budget and alerts) as a final backstop your code can't bypass.
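
The per-user limit from the list above can be sketched as a small in-memory counter. This is a minimal illustration under stated assumptions: UserRateLimiter is a hypothetical class, it uses simple fixed windows (one minute and one day), and its state lives in a single process, so a deployment with several server instances would need a shared store instead.

```typescript
// Hypothetical in-memory limiter: a fixed one-minute window plus a daily cap.
type Counter = { windowStart: number; count: number };

class UserRateLimiter {
  private perMinute = new Map<string, Counter>();
  private perDay = new Map<string, Counter>();
  private maxPerMinute: number;
  private maxPerDay: number;
  private now: () => number;

  // The clock is injectable so the limiter can be tested without waiting.
  constructor(maxPerMinute: number, maxPerDay: number, now: () => number = () => Date.now()) {
    this.maxPerMinute = maxPerMinute;
    this.maxPerDay = maxPerDay;
    this.now = now;
  }

  /** Returns true and records the request if allowed; false means "respond with 429". */
  allow(userId: string): boolean {
    const t = this.now();
    const minute = this.current(this.perMinute, userId, t, 60_000);
    const day = this.current(this.perDay, userId, t, 86_400_000);
    if (minute.count >= this.maxPerMinute || day.count >= this.maxPerDay) {
      return false;
    }
    minute.count += 1;
    day.count += 1;
    return true;
  }

  // Fetch the user's counter for a window, starting a fresh one if it expired.
  private current(map: Map<string, Counter>, userId: string, t: number, windowMs: number): Counter {
    let c = map.get(userId);
    if (!c || t - c.windowStart >= windowMs) {
      c = { windowStart: t, count: 0 };
      map.set(userId, c);
    }
    return c;
  }
}
```

In a proxy endpoint, the check would run after authentication and before the provider call, for example `if (!limiter.allow(user.id)) return new Response("Rate limit exceeded", { status: 429 });`.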
Going deeper
The backend-proxy pattern plus env vars, rotation, and limits covers the vast majority of real AI apps. A few more ideas matter once you go beyond a first project.
Streaming through the proxy. AI apps usually stream tokens for a snappy feel. Your proxy can stream too: the server opens a streaming call to the provider and pipes the chunks straight to the browser. The key still never leaves the server — you are just forwarding a response, not the credentials. This pairs with the latency techniques in designing for LLM latency.
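
The forwarding step can be sketched with only the web-standard Response type. This assumes the provider streams server-sent events and that the runtime (Node 18+ or a typical serverless/edge platform) exposes the Fetch API globals; forwardStream is a hypothetical helper name, and the upstream call itself (made with the secret key) is left out.

```typescript
// Hypothetical helper: pipe a provider's streaming response to the browser.
// `upstream` is the Response from the server-side provider call.
function forwardStream(upstream: Response): Response {
  if (!upstream.ok || !upstream.body) {
    // Return a generic error; do not echo upstream details to the client.
    return new Response("Upstream error", { status: 502 });
  }
  // Pass the body through unchanged and set fresh headers, so nothing
  // from the upstream exchange is copied to the browser by accident.
  return new Response(upstream.body, {
    status: 200,
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
    },
  });
}
```

Building the outgoing headers from scratch, instead of copying the upstream ones, keeps the response limited to the streamed content.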
Scoped and short-lived keys. Some providers let you create restricted keys (limited to certain models, projects, or spend) or issue short-lived tokens. Prefer the least-privileged key that still works, so a leak unlocks as little as possible.
Don't trust the client's other inputs either. Once you proxy, remember the browser can send anything. Validate the prompt size and shape on the server, and never let the client choose an arbitrary model or pass through raw provider parameters that could raise your cost. The server decides the model and the ceilings.
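
A minimal sketch of that server-side validation follows. The function name, the error strings, and the 4,000-character ceiling are illustrative assumptions; pick limits that match your own model and budget.

```typescript
// Illustrative ceiling on input size; tune it for your model and budget.
const MAX_PROMPT_CHARS = 4000;

type Validated =
  | { ok: true; prompt: string }
  | { ok: false; error: string };

// Accept only a bounded string prompt from the client. Any other field the
// browser sends (model, max_tokens, ...) is deliberately ignored, so the
// server stays in charge of the model and the cost ceilings.
function validateChatBody(body: unknown): Validated {
  if (typeof body !== "object" || body === null) {
    return { ok: false, error: "Body must be a JSON object" };
  }
  const prompt = (body as Record<string, unknown>).prompt;
  if (typeof prompt !== "string") {
    return { ok: false, error: "prompt must be a string" };
  }
  const trimmed = prompt.trim();
  if (trimmed.length === 0) {
    return { ok: false, error: "prompt must not be empty" };
  }
  if (trimmed.length > MAX_PROMPT_CHARS) {
    return { ok: false, error: "prompt is too long" };
  }
  return { ok: true, prompt: trimmed };
}
```

A proxy endpoint would call this on the parsed request body and return HTTP 400 with the error string when ok is false.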
Secrets in logs and errors. A surprising number of leaks come from logging a full request object or printing an error that includes headers. Scrub secrets from logs, and be careful what you return to the browser in an error response.
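
One way to scrub logs is to redact before anything is written. The sketch below is an assumption-laden illustration: the header list and the key pattern (strings starting with "sk-", the prefix OpenAI and Anthropic keys use) are examples, not an exhaustive filter, and redactHeaders and redactText are hypothetical names.

```typescript
// Headers that commonly carry credentials (compared case-insensitively).
const SENSITIVE_HEADERS = new Set(["authorization", "x-api-key", "cookie", "set-cookie"]);

// Rough pattern for key-like strings; adjust it for the providers you use.
const KEY_LIKE = /\bsk-[A-Za-z0-9_-]{8,}/g;

// Return a copy of the headers that is safe to log.
function redactHeaders(headers: Record<string, string>): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [name, value] of Object.entries(headers)) {
    out[name] = SENSITIVE_HEADERS.has(name.toLowerCase()) ? "[REDACTED]" : value;
  }
  return out;
}

// Mask key-like substrings in free text such as error messages.
function redactText(text: string): string {
  return text.replace(KEY_LIKE, "[REDACTED]");
}
```

Running every log line and every error message through helpers like these is a backstop, not a substitute for avoiding logging whole request objects in the first place.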
The mobile and desktop case. A compiled mobile or desktop app feels more private than a web page, but it is not — keys can be extracted from app binaries too. The same rule holds everywhere: the secret belongs on a server, and the client gets a result, never the key. If you are retrofitting an existing product, see adding AI to an existing app.
The durable lesson is one sentence: a secret that reaches the browser is not a secret. Build every AI app so the only thing the user's device ever holds is the prompt and the answer — and you will never wake up to a stranger's bill.
FAQ
How do I hide my OpenAI (or Anthropic) API key in a React/frontend app?
You can't hide it in the frontend — anything the browser downloads is readable. The real fix is to not put the key in the browser at all. Move the model call to a small backend or serverless function that reads the key from an environment variable, and have your React app call that endpoint instead of the provider.
Is it ever safe to call the LLM API directly from the browser?
Not with your real secret key: it would be public the moment the page loads. The only browser-safe variant is a short-lived, narrowly scoped token that your server mints for that one session, where the provider supports it. Otherwise, route the call through your own proxy. The long-lived secret always stays server-side.
Where should I store my API key in production?
In environment variables set through your hosting platform's dashboard (Vercel, Netlify, Render, a cloud provider, etc.) — never in a file you commit to git. As you scale to a team or many services, graduate to a secret manager like AWS Secrets Manager, Google Secret Manager, or Doppler for access control and audit logs.
I accidentally pushed my API key to GitHub — what do I do?
Assume it's already stolen. Revoke or rotate the key in the provider's dashboard immediately so the old one stops working, then create a new key and update your environment variables. Deleting the commit is not enough — the key remains in git history, and bots scan public repos within minutes.
How do I stop one user from running up a huge bill on my AI app?
Because every request goes through your server, add defenses there: require authentication, rate-limit each user (per minute and per day), cap each call with max_tokens, and set a hard monthly spend limit plus billing alerts in the provider dashboard as a final backstop.
What is a backend proxy for an AI app?
It's a small server endpoint that sits between your browser and the model provider. The browser sends only the prompt to your endpoint; the endpoint adds the secret key (from an env var), calls the provider, and returns just the answer. The key lives only on the server, so users never see it.