OpenAI · 2026-05-08 · notable

Running Codex Safely at OpenAI — Sandbox, Approval Policy, Auto-Review, and Agent-Native Telemetry

OpenAI publishes how its Security team deploys Codex internally: a sandbox boundary that gates writes and network reach, a per-action approval policy with an Auto-review subagent for routine asks, and agent-native telemetry that feeds an AI-powered security triage agent.

Header graphic from OpenAI's 'Running Codex safely at OpenAI' blog post.

OpenAI's Security team writes up the controls and audit trail it uses to govern Codex when the agent acts on real workflows.

What is it?

A blog post from OpenAI explaining how its Codex coding agent is run in production: where it can write, when it must ask for approval, which networks it can reach, and how every action gets logged. Codex is OpenAI's CLI/IDE/cloud agent that can autonomously read repos, run commands, and edit files on behalf of engineers.

How does it work?

Codex executes inside a sandbox that defines writable paths, network reach, and protected directories. An approval policy decides which actions need a human sign-off. An Auto-review subagent receives the planned action plus recent context and pre-clears routine asks, so developers are not stopped on every benign command. Codex also emits agent-native telemetry that an AI-powered security triage agent ingests alongside endpoint alerts, so reviewers see the intent behind each action, not just the artifact left behind.
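To make the layering concrete, here is a minimal Python sketch of a sandbox check that feeds an approval decision and a telemetry record. It is an illustration only, not Codex's implementation or configuration format: the names `Sandbox`, `Action`, `decide`, and `telemetry_event`, the three outcomes ("allow", "ask", "deny"), and the rule that out-of-sandbox requests are escalated rather than refused are all assumptions made for this example.

```python
import posixpath
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Sandbox:
    """Hypothetical sandbox description; field names are illustrative."""
    writable_roots: tuple[str, ...]
    protected_dirs: tuple[str, ...]
    allowed_hosts: frozenset[str]


@dataclass(frozen=True)
class Action:
    """A planned agent action (hypothetical shape)."""
    kind: str    # "write" or "network"
    target: str  # absolute POSIX path for writes, hostname for network
    intent: str  # the agent's stated reason for the action


def _is_under(path: str, root: str) -> bool:
    # Normalize first so "/repo/../etc" is not mistaken for a path under "/repo".
    p = PurePosixPath(posixpath.normpath(path))
    r = PurePosixPath(posixpath.normpath(root))
    return p == r or r in p.parents


def decide(sandbox: Sandbox, action: Action) -> str:
    """Return "allow", "deny", or "ask" (escalate to a reviewer)."""
    if action.kind == "write":
        # Protected directories win even when they sit inside a writable root.
        if any(_is_under(action.target, d) for d in sandbox.protected_dirs):
            return "deny"
        if any(_is_under(action.target, r) for r in sandbox.writable_roots):
            return "allow"
        return "ask"
    if action.kind == "network":
        return "allow" if action.target in sandbox.allowed_hosts else "ask"
    # Unknown action kinds are escalated rather than silently allowed.
    return "ask"


def telemetry_event(action: Action, decision: str) -> dict:
    """Record the stated intent alongside the action and its outcome."""
    return {
        "kind": action.kind,
        "target": action.target,
        "intent": action.intent,
        "decision": decision,
    }
```

In this sketch an "ask" result is where a reviewer would step in, whether that is an automated reviewer for routine requests or a human for the rest. Keeping `intent` in the event is the point of the telemetry idea described above: a triage step can read why the agent acted, not only what it touched.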

Why does it matter?

Coding agents now act on behalf of developers with credentials and shell access, and security teams have very few public templates for how to wrap one. This is one of the first concrete writeups from a frontier lab about the operational boundary (sandbox plus approvals plus telemetry) that has to sit between an LLM agent and a production environment before the agent ships to a security-minded enterprise.

Who is it for?

Platform and security engineers deploying coding agents.

Try it

https://developers.openai.com/codex/security