// LLM knowledge base for OpenAI Codex — CLI · ChatGPT Codex · Codex Cloud · OAuth backend · API surface · AGENTS.md · MCP
┌──────────────────── CODEX ────────────────────┐
│                                               │
│    CLI        ChatGPT Codex      Codex Cloud  │
│   (Rust)    (web/desktop app)     (remote)    │
│    │               │                 │        │
│    └─────────┬─────┴─────────────────┘        │
│              ▼                                │
│          shared auth                          │
│  ChatGPT plan (default) or API-key            │
│              │                                │
│              ▼                                │
│        Responses API                          │
│  backend-api/codex · /v1/responses            │
│              │                                │
│              ▼                                │
│      gpt-5-codex (default)                    │
│   gpt-5.x · gpt-4.x variants                  │
└───────────────┬───────────────────────────────┘
                ▼
           tool surface
  shell exec · file edit · Apply Patch
  MCP (client + server) · web search · browser
  AGENTS.md guidance · approval modes
Umbrella term for OpenAI's coding-agent product line. Three surfaces share the same backend + guidance conventions. Not to be confused with the 2021 Codex model (code-davinci-002), which has since been deprecated. Modern Codex is agent-first, not a raw completion model.
Rust-based terminal agent. Installable via npm install -g @openai/codex, brew install codex, or the GitHub release. Runs locally, reads + edits your repo, executes shell commands through an approval layer. Open source.
The agent inside ChatGPT (web, desktop, mobile). Full Codex agent experience without the terminal — edit files in a browser-rendered sandbox, run commands, open pull requests. Paid plans (Plus / Pro / Business / Enterprise) unlock the agent quota.
Remote agent that fans out parallel tasks in isolated sandboxed VMs. Invoke from ChatGPT or via API; the agent works asynchronously and reports back with a diff / PR / artefact. Good for long-running work, multi-branch experiments, overnight runs.
The unified inference endpoint behind Codex. Tool calls, structured outputs, streaming. All three Codex surfaces talk to this same backend; differences are in the harness, not the model path.
chatgpt.com/backend-api/codex — the endpoint Codex CLI hits when authenticated via ChatGPT OAuth instead of an API key. Entitlement-funded against your ChatGPT plan. Not a publicly supported API surface; intermittently returns an empty response.output=[]. Third-party projects that route through it (like the Hermes Pi harness) need self-heal paths.
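A self-heal path for the empty-output quirk can be as simple as a retry wrapper. A minimal sketch: `send_fn` is a caller-supplied function (hypothetical here) that performs one request against the backend and returns the parsed response dict; only the empty-`output` retry logic is shown.

```python
import time

def call_codex_backend(prompt, send_fn, max_retries=3, backoff_s=2.0):
    """Retry wrapper for the unofficial backend-api/codex path.

    The backend intermittently returns response.output=[], so an empty
    output list is treated as retryable rather than as a final answer.
    """
    for attempt in range(max_retries):
        resp = send_fn(prompt)
        if resp.get("output"):                 # non-empty output: success
            return resp
        time.sleep(backoff_s * (attempt + 1))  # linear backoff before retrying
    raise RuntimeError("codex backend returned empty output after retries")
```

The same shape works for any harness that treats this endpoint as best-effort: retry a few times, then fall back to another model path.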
npm install -g @openai/codex or brew install codex. Ships as a single Rust binary; the npm package wraps it. Platforms: macOS, Linux (x64 + arm64), Windows via WSL2.
codex opens the interactive TUI in the current directory. codex exec "<prompt>" runs a non-interactive single-shot for scripting. Supports stdin piping. --model pins the model; -p <profile> picks a profile from config.toml.
Resume a prior session. codex resume opens an interactive picker; codex resume --last picks the most recent one. Session transcripts live under ~/.codex/sessions/.
Four modes gate shell execution + file writes. read-only (default): every write needs approval. auto: auto-approve edits inside the working dir. full-access: auto-approve everything inside a sandbox. danger-full-access: bypass sandbox (YOLO mode, for CI/Docker).
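For CI or scripted runs the mode can be pinned in config.toml instead of chosen interactively. A minimal sketch, assuming the approval_policy and sandbox_mode key names from current Codex config docs (verify against your installed version):

```toml
# ~/.codex/config.toml — pin non-interactive behaviour for CI
approval_policy = "never"           # never prompt; rely on the sandbox instead
sandbox_mode    = "workspace-write" # writes allowed inside the workspace only
```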
On macOS: seatbelt (sandbox-exec) isolates Codex's shell calls to the working directory. On Linux: Landlock + seccomp. Full-access mode still respects the sandbox; danger-full-access disables it for use with external isolation (Docker, a VM).
Codex CLI's preferred file-edit format — a unified-diff-style block wrapped in *** Begin Patch / *** End Patch markers. More reliable than freeform write for large edits because it's applied atomically with full-file context.
Inline Y / N / Esc prompts for every action that exceeds the current approval mode. Batch-approve with Shift+A for the rest of the session. Approval history is recorded in ~/.codex/logs/.
Each Codex task runs in an isolated container attached to your repo. Codex has shell + file edit inside that container. The UI renders diffs, a terminal, and a file tree. You never clone the repo locally.
When Codex finishes, it offers to open a pull request directly on GitHub (or push to a branch). Diff preview is in-UI. Multiple Codex tasks can run in parallel on the same repo on separate branches.
OAuth to your GitHub account or org. Per-repo enable. Codex can read, branch, push, open PRs. Respects branch-protection rules on the GitHub side.
Fan out multiple Codex tasks from one chat — "Fix these 5 issues in parallel." Each runs in its own container + branch. Surfaces one of Codex's main advantages over local CLI agents: no contention for the working tree.
Plus / Pro / Business / Enterprise include Codex usage. Quota scales with the plan. Enterprise adds SSO + audit log + org-scoped repo access.
Async Codex task running in a sandboxed VM on OpenAI infrastructure. Kicked off from ChatGPT or API. Reports back when done — diff, PR, artefact, or failure. Good fit for 15-minute work you don't want to babysit.
Per-repo setup hook that runs once when the Codex Cloud VM is provisioned — clone, install deps, warm caches. Keeps subsequent tasks fast. Equivalent shape to a GitHub Action but scoped to Codex runs.
Per-repo env vars + secrets configured in the ChatGPT UI. Injected into the Codex VM at run time. Never echoed to chat output. For repo-scoped credentials (DB URL, test API key) that Codex needs to run the dev loop.
OAuth to your ChatGPT account. codex login opens a browser, returns a token cached at ~/.codex/auth.json. Entitlement-funded via your ChatGPT plan. No per-call cost on top of the subscription.
Alternative for scripted / CI use. OPENAI_API_KEY env var. Billed per-token against the OpenAI account. Same model behaviour; different billing + different backend (api.openai.com/v1/responses rather than chatgpt.com/backend-api/codex).
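For the API-key path, the request shape is the public /v1/responses schema. A minimal sketch using only the stdlib; the helper name is ours, and the payload shows just the two core fields (model, input):

```python
import json
import os
import urllib.request

API_URL = "https://api.openai.com/v1/responses"

def build_responses_request(prompt, model="gpt-5-codex"):
    """Build an HTTP request for the public Responses API (API-key path).

    Billed per-token against the OpenAI account, unlike the
    ChatGPT-plan-funded backend-api/codex path.
    """
    payload = {"model": model, "input": prompt}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {os.environ.get('OPENAI_API_KEY', '')}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Sending is one more line once a real OPENAI_API_KEY is set:
# urllib.request.urlopen(build_responses_request("explain this diff"))
```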
Run shell commands inside the sandbox. Subject to the current approval mode. Long-running commands stream output; timeouts configurable in config.toml.
Read or write files relative to the working directory. Write operations prefer the apply_patch format for reliability on large edits. Subject to approval in read-only mode.
Live web search. On by default in ChatGPT Codex; opt-in via config in the CLI (tools.web_search=true). Returns distilled results.
Codex CLI is an MCP client — configure external MCP servers in ~/.codex/config.toml under [mcp_servers]. Stdio + HTTP transports. Tools show up alongside built-ins in the approval prompt.
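A minimal stdio-transport entry as a sketch; the [mcp_servers.<name>] table shape follows the Codex CLI config docs, while the server name and package are illustrative:

```toml
# ~/.codex/config.toml — register an external MCP server (stdio transport)
[mcp_servers.docs-search]
command = "npx"
args = ["-y", "@example/mcp-docs-server"]   # hypothetical package
```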
Inverse direction: expose Codex itself as an MCP server so Claude / Cursor / any MCP client can delegate coding tasks to it. Run via codex mcp. Useful for heterogeneous agent pipelines.
Unified-diff-like block: *** Begin Patch · *** Update File: path · hunks · *** End Patch. Also supports *** Add File and *** Delete File. Applied atomically. Prefer this over freeform write for multi-file changes.
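A minimal sketch of the envelope (file path, context marker, and hunk content are illustrative):

```
*** Begin Patch
*** Update File: src/main.rs
@@ fn main() {
-    println!("hello");
+    println!("hello, codex");
*** End Patch
```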
User-global config. Defaults (model, approval mode, working dir) are top-level keys; [profiles.<name>] defines named alternates; [mcp_servers] and [tools] get their own tables; sandbox tuning has its own keys. TOML, not JSON.
Named config presets. codex -p fast picks the profile keyed [profiles.fast]. Good for switching between models, approval modes, or MCP-server sets without editing the main config.
Repo-root file read by every Codex surface on every turn. Project overview, conventions, forbidden patterns, command recipes. Same role as Claude Code's CLAUDE.md. Cascading — nested AGENTS.md files apply to their subtree.
Sandbox tuning — writable-paths allowlist, network-access toggle, timeout. Defaults are restrictive; loosen only for specific projects that need it.
Per-session transcripts. codex resume reads from here. .jsonl format — one event per line. Safe to prune old sessions.
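Because transcripts are plain .jsonl, they are easy to mine for tooling or pruning scripts. A minimal sketch; the only assumption is that each non-blank line is a JSON object, since the exact event schema varies by Codex version:

```python
import json

def load_session(path):
    """Parse a Codex session transcript (~/.codex/sessions/*.jsonl).

    One JSON event per line; blank lines are skipped defensively.
    """
    events = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                events.append(json.loads(line))
    return events
```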
Codex's default backing model. Agent-tuned variant of GPT-5 — trained on tool use, long-context code editing, planning. Runs behind all three surfaces unless overridden.
General-purpose GPT-5 variants (non-codex-tuned). Selectable via --model gpt-5 or a profile. Fine for lighter coding work; Codex-tuned model handles long agent runs better.
Older gpt-4o / gpt-4.1 still available via --model. Cheaper + faster for simple tasks; less reliable on long multi-step agent work.
Codex exposes a reasoning-effort dial (low / medium / high). Trades latency for depth. Default is medium. High for hard planning or debugging; low for quick edits.
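The dial can also be pinned per-profile or globally in config.toml. A sketch, assuming the model_reasoning_effort key name from current Codex config docs:

```toml
# ~/.codex/config.toml — dial effort up for hard planning or debugging
model_reasoning_effort = "high"   # low | medium (default) | high
```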
Codex is built around the tool-call loop, not raw token completion. The model plans, calls tools, observes, iterates. Different mental model from calling a chat API — you steer with prompts, not by crafting a single completion.
Codex's approval gate is its safety surface. Unlike Claude Code's settings-driven allowlist, Codex prompts interactively by default and remembers approvals for the rest of the session. For CI use, pin an approval mode in the profile.
Same product class, different vendor. Codex leans on sandbox-first + ChatGPT surface; Claude Code leans on hooks + plugin skills + Anthropic API. Codex has Cloud (remote parallel agents); Claude Code has stronger in-repo hook/skill conventions. Both are MCP clients + can be MCP servers.
ChatGPT-auth Codex inference is covered by the ChatGPT plan subscription — effectively zero marginal cost per call on Plus/Pro. Third-party agents that route through chatgpt.com/backend-api/codex (Hermes Pi harness, some autoresearch loops) leverage this. Tradeoff: unstable API surface, intermittent empty output.
Human-readable operator guide. Tabs: Overview · Install · CLI · ChatGPT Codex · Codex Cloud · Auth · Tools · Config · Models · Compare.
Official OpenAI Codex docs. Source of truth for behaviour changes, new models, surface changes.
Anthropic Claude Code — sibling product. Useful for comparing hook systems, skill surfaces, and CLI ergonomics across vendors.
Hermes uses Codex OAuth as its primary inference path on the Pi harness, with Gemma-4 local fallback. Real-world operational detail on the codex_responses backend's stability quirks + self-heal patches.
Duraclaw's executor adapter for Codex — shows a hosted-orchestrator pattern layering Codex behind a web UI alongside Claude Code and OpenCode.