AI/TLDR

What Is OpenAI Codex? Cloud and CLI Coding Agents Explained

Understand what OpenAI Codex is across its cloud and CLI forms, how it differs from Copilot, and how to delegate your first task to it.

BEGINNER · 10 MIN READ · UPDATED 2026-06-12

In plain English

OpenAI Codex is a coding agent from OpenAI. You give it a task — "add input validation to the registration form" or "fix the flaky tests in the auth module" — and it goes off and does the work on its own: reads your code, makes edits, runs tests, and hands you a finished diff or pull request when it's done.

[Figure: OpenAI Codex diagram. Source: generativeaipub.com]

Here is the everyday analogy. An autocomplete tool like GitHub Copilot is a smart colleague looking over your shoulder, finishing the sentence you started. Codex is more like sending that colleague an email that says "take care of this ticket" and coming back an hour later to find a pull request waiting for review. You set the goal; the agent does the mechanical work.

Codex ships in two forms that target different workflows. Codex cloud (the web experience at chatgpt.com/codex) runs tasks asynchronously in OpenAI-managed sandboxes — you can queue several tasks, close your laptop, and review the results later. Codex CLI (@openai/codex) is an open-source command-line tool you install locally and drive interactively from your terminal, closer to the experience of Claude Code or Cursor's agent mode.

Why it matters

Before agents like this, getting AI to help with a multi-step coding task meant a lot of manual shuttling. You'd ask a chatbot for a function, paste it into your editor, run it, copy the error back into the chat, get a fix, paste that in, and repeat. You were the loop — and that loop was exhausting for anything beyond a single function.

Codex closes that loop. The agent writes the change, runs the tests itself, reads the failure, and tries again — all without you in the middle. That single shift — from suggest to act — is what makes coding agents a meaningful productivity change rather than just a fancier autocomplete.

The async advantage

Codex cloud adds something even more valuable: parallelism. Because each task runs in its own sandboxed cloud container, you can queue several tasks simultaneously and work on something else while they run. Most tasks take between one and thirty minutes. When Codex finishes, it surfaces the diff and command logs so you can trace exactly what happened. This asynchronous pattern — assign, context-switch, review — is how power users multiply their throughput.

Who should care

  • Developers with a backlog of chores — writing tests, bumping dependencies, fixing lint, writing database migrations. These are tasks that are well-defined but tedious; Codex handles them while you focus on the interesting work.
  • Teams using GitHub — you can tag @codex on a GitHub issue or pull request to spin up a task and get a proposed change without leaving GitHub.
  • Beginners learning a codebase — Codex can read files, explain what they do, and make a small targeted change, which is a fast way to understand unfamiliar code.
  • Anyone who wants to delegate, not just autocomplete — if you find yourself doing the copy-paste loop described above, a coding agent replaces the loop with a review step.

How it works

Codex is built on codex-1, a version of OpenAI's o3 reasoning model that was further trained on real-world coding tasks using reinforcement learning. The training targeted code that mirrors human PR style, follows instructions precisely, and iterates on failing tests — which is exactly the behavior you want in an agent you'll review rather than dictate to.

The sandbox: why isolation matters

Every Codex cloud task runs inside its own isolated container: a clean environment that is preloaded with your repository and torn down when the task finishes. One task cannot interfere with another, and the agent's actions cannot touch anything outside that container. By default, the agent phase runs with no network access, so it cannot call out to external services or exfiltrate data; you can opt into selective internet access per environment if the task genuinely needs to install packages or call an external API.

The agent loop inside the sandbox

Inside the sandbox, Codex runs the same basic loop that most coding agents use: read context, decide the next action, call a tool (read a file, write a file, run a shell command, run a test), observe the result, repeat. The model sees terminal output, test results, and error messages the same way a developer would. It keeps iterating until it decides the goal is met or it runs out of steps.
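The loop described above can be sketched in a few lines of Python. This is a toy illustration, not Codex's implementation: the `AgentLoop` class, the `scripted_model` function standing in for the model, and the two fake tools are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# An action is a (tool_name, argument) pair; ("done", "") ends the loop.
Action = Tuple[str, str]


@dataclass
class AgentLoop:
    """Toy read-decide-act-observe loop (illustrative only)."""
    decide: Callable[[List[str]], Action]     # stands in for the model
    tools: Dict[str, Callable[[str], str]]    # stands in for file and shell tools
    max_steps: int = 10
    history: List[str] = field(default_factory=list)

    def run(self, goal: str) -> str:
        self.history.append(f"goal: {goal}")
        for _ in range(self.max_steps):
            tool, arg = self.decide(self.history)   # decide the next action
            if tool == "done":
                return "goal met"
            observation = self.tools[tool](arg)     # call the tool
            self.history.append(f"{tool}({arg}) -> {observation}")  # observe
        return "out of steps"


# Hypothetical stand-ins: the fake test suite passes once a fix has been written.
state = {"fixed": False}


def write_file(path: str) -> str:
    state["fixed"] = True
    return "written"


def run_tests(_: str) -> str:
    return "pass" if state["fixed"] else "fail"


def scripted_model(history: List[str]) -> Action:
    """Pick the next action from the most recent observation."""
    last = history[-1]
    if last.endswith("-> pass"):
        return ("done", "")
    if last.endswith("-> fail"):
        return ("write_file", "auth.py")
    return ("run_tests", "")


agent = AgentLoop(decide=scripted_model,
                  tools={"write_file": write_file, "run_tests": run_tests})
result = agent.run("fix the flaky tests in the auth module")
```

The `max_steps` cap mirrors the "runs out of steps" exit: the loop stops either when the model declares the goal met or when its step budget is spent.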

AGENTS.md: standing instructions for the agent

Codex reads an AGENTS.md file from your repository before starting any task, the same way you'd give a new teammate a project brief. You put coding standards, build commands, test instructions, and architecture notes in this file, and every task automatically inherits that context. It supports cascading files — a repo-level AGENTS.md plus more specific ones in subdirectories — so a monorepo can give different instructions to different parts of the codebase.
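One way to picture the cascade is as a walk from the repository root down to the directory being edited, collecting every AGENTS.md along the way. The sketch below assumes that ordering (root first, most specific last) and simply concatenates the files; the function names are hypothetical, and Codex's actual lookup and precedence rules may differ.

```python
from pathlib import Path
from typing import List


def collect_agents_files(repo_root: Path, target_dir: Path) -> List[Path]:
    """Return AGENTS.md files from repo_root down to target_dir, root first."""
    repo_root = repo_root.resolve()
    target_dir = target_dir.resolve()
    # relative_to raises ValueError if target_dir is outside the repository.
    relative = target_dir.relative_to(repo_root)

    directories = [repo_root]
    for part in relative.parts:
        directories.append(directories[-1] / part)
    return [d / "AGENTS.md" for d in directories if (d / "AGENTS.md").is_file()]


def merged_instructions(repo_root: Path, target_dir: Path) -> str:
    """Concatenate the files so that more specific instructions come last."""
    files = collect_agents_files(repo_root, target_dir)
    return "\n\n".join(f.read_text(encoding="utf-8").strip() for f in files)
```

For a monorepo with an AGENTS.md at the root and another in services/api, calling `merged_instructions(root, root / "services" / "api")` would return the root brief followed by the service-specific one.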

Cloud vs CLI: two ways to use Codex

Codex ships as two distinct products that share the same underlying model but offer very different experiences. Which one to reach for depends on whether you want asynchronous cloud delegation or interactive local control.

Getting started with Codex CLI

The CLI is available as an npm package and as a native binary. The simplest install on macOS/Linux is the shell installer:

Install Codex CLI (bash)
# macOS / Linux
curl -fsSL https://chatgpt.com/codex/install.sh | sh

# Windows (PowerShell)
powershell -ExecutionPolicy ByPass -c "irm https://chatgpt.com/codex/install.ps1 | iex"

# Or via npm
npm i -g @openai/codex

# Or via Homebrew
brew install --cask codex

Once installed, navigate to your project and run codex. The first time you launch it, you'll be prompted to sign in with your ChatGPT account or paste an API key. After that you drop into an interactive session where you describe what you want and watch the agent work.

First session (bash)
# Navigate to a project under git version control
cd my-project

# Start interactive session
codex

# Or start the session with an initial prompt
codex "add JSDoc comments to all exported functions in src/utils.ts"

Codex vs Copilot vs Claude Code

Three names come up constantly in this space, and they solve different problems. Here is where each fits.

| Tool | Primary form | How you interact | Best for |
| --- | --- | --- | --- |
| GitHub Copilot | Editor plugin | Inline completions + chat | Line-by-line assistance as you type |
| OpenAI Codex | Cloud agent + CLI | Assign tasks, review diffs | Delegating whole tasks, async work |
| Claude Code | Terminal CLI | Interactive session or one-shot | Hands-on agentic sessions in your terminal |

GitHub Copilot is the autocomplete layer: it lives inside your editor and finishes lines, suggests functions, and chats while you type. In that role it is not an autonomous agent; you are still driving every change. Codex and Claude Code are both coding agents: you describe a goal and the agent executes it. The difference between them is mostly workflow. Codex cloud is optimized for async parallel delegation (queue a task, come back later), while Claude Code is optimized for an interactive session where you watch and guide the agent in real time.

In practice, many developers use two or three of these simultaneously: Copilot for in-editor autocomplete while coding, Codex cloud to handle a backlog ticket in the background, and a terminal agent for exploratory sessions. None of them is strictly a replacement for the others.

Going deeper

Once you have the basics working, there are several more capable layers worth knowing about as your usage matures.

Subagents and parallel workers

For large tasks, Codex uses a multi-agent architecture: a manager model that coordinates several parallel worker agents, each with its own context window working on a slice of the problem. A large refactor can be broken into modules, tackled in parallel, and merged — rather than processed serially in one enormous context. This is the same multi-agent pattern that makes cloud tasks faster than you would expect: the system is not limited to one thread of execution.
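The manager-and-workers idea can be illustrated with a plain fan-out: split the job into slices, run one worker per slice in parallel, and gather the results. This is a minimal sketch using Python's standard thread pool, not a description of how Codex schedules its agents; `run_in_parallel` and `rename_worker` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def run_in_parallel(slices: List[str],
                    worker: Callable[[str], str],
                    max_workers: int = 4) -> Dict[str, str]:
    """Manager step: hand each slice to a worker, then gather results by slice."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with slices.
        results = list(pool.map(worker, slices))
    return dict(zip(slices, results))


def rename_worker(module: str) -> str:
    # Hypothetical worker: a real one would run its own agent loop on the module.
    return f"{module}: renamed get_user -> fetch_user"


report = run_in_parallel(["auth", "billing", "search"], rename_worker)
```

Each worker sees only its own slice, which is the point of the pattern: no single context has to hold the whole refactor at once.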

Internet access in sandboxes

By default, the agent phase in a cloud sandbox runs offline. You can opt into internet access per environment, with granular control over which domains and HTTP methods Codex can use. This is important for tasks that require installing packages from a registry, running integration tests that call an external API, or looking up current documentation. The setup phase (before the agent starts) can always access the network to install dependencies you've pre-specified.
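Conceptually, per-domain and per-method control amounts to an allowlist check on every outgoing request. The sketch below shows one way such a check could work; the `NetworkPolicy` class and its fields are assumptions made for illustration and do not reflect Codex's real configuration format.

```python
from dataclasses import dataclass
from typing import FrozenSet
from urllib.parse import urlsplit


@dataclass(frozen=True)
class NetworkPolicy:
    """Hypothetical per-environment policy (not Codex's real config format)."""
    allowed_domains: FrozenSet[str] = frozenset()   # empty set means fully offline
    allowed_methods: FrozenSet[str] = frozenset({"GET", "HEAD"})

    def permits(self, method: str, url: str) -> bool:
        host = (urlsplit(url).hostname or "").lower()
        if method.upper() not in self.allowed_methods:
            return False
        # Allow an exact match or any subdomain of an allowed domain.
        return any(host == d or host.endswith("." + d)
                   for d in self.allowed_domains)


offline = NetworkPolicy()
registry_only = NetworkPolicy(
    allowed_domains=frozenset({"pypi.org", "pythonhosted.org"}))
```

With `registry_only`, a GET to pypi.org or files.pythonhosted.org is permitted, while a POST to the same host or a GET to any other domain is refused. Restricting methods to reads is one way to let an agent install packages without letting it send data out.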

GitHub integration

Codex can connect directly to GitHub. Once you install the Codex GitHub App, you can tag @codex in any issue or pull request comment to spin up a task. The agent reads the issue, opens a branch, makes the changes, runs the tests, and opens a pull request — all without you leaving GitHub. It's the same cloud agent, just triggered from a different surface.

Pricing and access

As of mid-2026, Codex is included in every paid ChatGPT plan: Plus ($20/month) gets 10–60 cloud tasks per five-hour window, and Pro (starting at $100/month) gets 5x or 20x the Plus limits. The CLI is open-source on GitHub under the openai/codex repository and can be used with an API key from any paid account. OpenAI moved to token-based credit billing in April 2026 — lighter tasks cost less than heavier ones, rather than charging a flat per-message rate.

Reviewing output critically

The practical frontier of working with Codex is review quality, not task assignment. The agent is good at well-scoped, testable tasks where it can confirm its own work (run the tests, check that they pass). It struggles with tasks that require judgment calls, understanding unstated business logic, or touching code that has no tests. The output of a Codex task is a starting point for review, not a finished change to merge blindly. Industry surveys in 2026 found that close to 43% of AI-generated changes needed debugging in production — a reminder that the human review step is not optional. Keep the scope tight, write a clear task description (what the goal is and how to verify it), and read every diff before merging.

FAQ

What is OpenAI Codex in simple terms?

OpenAI Codex is a coding agent from OpenAI. You describe a task in plain English, Codex runs it in an isolated environment — reading your code, making edits, running tests — and hands you a diff or pull request when it's done. It comes as both a cloud product (async, parallel tasks) and an open-source CLI tool (interactive, local).

How is OpenAI Codex different from GitHub Copilot?

GitHub Copilot is an autocomplete and chat tool that assists you line by line as you type inside your editor. Codex is an autonomous agent — you give it a whole task and it executes the steps on its own. Copilot makes suggestions; Codex takes actions. Many developers use both together.

How do I install and use the Codex CLI?

Run npm i -g @openai/codex or use the shell installer at https://chatgpt.com/codex/install.sh on macOS/Linux (there is also a PowerShell installer for Windows). Then navigate to a project and run codex to start an interactive session, or codex "your task here" to open the session with that prompt already submitted. Authenticate with your ChatGPT account or an API key on first run.

What is the AGENTS.md file in Codex?

AGENTS.md is a plain text file you add to your repository that Codex reads before starting any task — like a standing brief for the agent. Put your build command, test command, coding standards, and project-specific rules in it. Codex supports cascading AGENTS.md files, so a monorepo can have per-directory instructions that override the root-level file.

Is Codex free?

The Codex CLI is open-source and free to download, but you need a paid ChatGPT account or an OpenAI API key to use it. Codex cloud is included in every paid ChatGPT plan — Plus ($20/month), Pro (starting at $100/month), Business, and Enterprise. As of 2026 billing is token-based, so lighter tasks cost less than heavier ones.

How does Codex compare to Claude Code?

Both are coding agents that take whole tasks rather than line-by-line suggestions. Codex cloud is optimized for async parallel delegation — queue multiple tasks, close your laptop, review diffs later. Claude Code is optimized for interactive sessions — you watch the agent work in your terminal in real time, approving risky actions as they come up. Many developers use both depending on the task.

Further reading