OpenAI Frontier team builds >1M LOC internal product with zero human-written code using Codex agents
The team needed to develop and ship an enterprise-grade product at speed, but humans were fundamentally the bottleneck: agent code output vastly outpaced the team's capacity for synchronous review. In addition, early Codex models produced code too slow and insufficiently modular to assemble into working software.
How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · Codex spawned as entry point
The coding agent is the entry point, given skills and scripts to boot the stack and proceed with its task.
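To make the entry-point idea concrete, here is a minimal sketch of a session that runs its boot scripts before starting the task. It is an illustration under assumptions, not the team's implementation: the names `Skill`, `AgentSession`, `boot`, and `start` are hypothetical, and the "scripts" are stand-in Python callables rather than real Codex skills.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Skill:
    """Hypothetical stand-in for a boot script the agent can run."""
    name: str
    run: Callable[[], bool]  # returns True when the step succeeds


@dataclass
class AgentSession:
    """Hypothetical session: the agent boots the stack itself, then works."""
    task: str
    boot_skills: List[Skill]
    log: List[str] = field(default_factory=list)

    def boot(self) -> bool:
        # Run each boot step in order and stop at the first failure.
        for skill in self.boot_skills:
            ok = skill.run()
            self.log.append(f"{skill.name}: {'ok' if ok else 'failed'}")
            if not ok:
                return False
        return True

    def start(self, work: Callable[[str], str]) -> str:
        # The task only begins once the stack is up.
        if not self.boot():
            return "boot failed"
        return work(self.task)


if __name__ == "__main__":
    session = AgentSession(
        task="add a health-check endpoint",
        boot_skills=[
            Skill("install-deps", lambda: True),
            Skill("start-dev-server", lambda: True),
        ],
    )
    print(session.start(lambda task: f"working on: {task}"))
    print(session.log)
```

The point of the design is that no human prepares the environment: the agent owns the boot sequence, so a failed step surfaces in the agent's own log before any task work starts.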
Over five months, a team of three used Codex agents to build a codebase of more than one million lines across roughly 1,500 PRs, with zero lines of human-written code. Merging became autonomous: the only human step required before release was a post-merge smoke test.
What failed first
Early review agents bullied the code-authoring agent into accepting every comment, causing thrashing and non-convergence; build times grew beyond what agents could iterate on effectively; and early Codex models could not assemble complex features from their constituent pieces.
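The first failure mode, an author that accepts every review comment, can be reproduced in a toy model. The sketch below is an assumption-laden illustration rather than the team's review setup: `review_loop` and the two reviewers are hypothetical, and the "code" under review is reduced to a single string. When reviewers hold conflicting preferences, the accept-everything author revisits an earlier state, which the loop reports as non-convergence.

```python
from typing import Callable, List, Optional, Tuple

# A reviewer inspects the current state and returns the state it wants
# instead, or None when it has no comment. (Hypothetical toy interface.)
Reviewer = Callable[[str], Optional[str]]


def review_loop(
    initial: str, reviewers: List[Reviewer], max_rounds: int = 10
) -> Tuple[str, int, bool]:
    """Simulate an author that accepts every comment.

    Returns (final_state, rounds_used, converged).
    """
    state = initial
    seen = {state}
    for round_no in range(1, max_rounds + 1):
        changed = False
        for reviewer in reviewers:
            requested = reviewer(state)
            if requested is not None and requested != state:
                state = requested  # the author accepts unconditionally
                changed = True
        if not changed:
            return state, round_no, True  # no comments left: converged
        if state in seen:
            return state, round_no, False  # revisited a state: thrashing
        seen.add(state)
    return state, max_rounds, False


def wants_tabs(state: str) -> Optional[str]:
    return None if state == "tabs" else "tabs"


def wants_spaces(state: str) -> Optional[str]:
    return None if state == "spaces" else "spaces"


if __name__ == "__main__":
    print(review_loop("spaces", [wants_tabs]))
    print(review_loop("spaces", [wants_tabs, wants_spaces]))
```

With a single reviewer the loop settles once the comment is addressed. With two conflicting reviewers the author flips the state back and forth and never reaches a round with no comments, which is the thrashing pattern described above.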