quality_assurance · saas · workflow

OpenAI Frontier team builds >1M LOC internal product with zero human-written code using Codex agents

The team needed to develop and ship an enterprise-grade product at speed, but humans were fundamentally the bottleneck: agent code output vastly outpaced the team's capacity for synchronous review, and early Codex models produced code too slow and insufficiently modular to assemble into working software.

How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · Codex spawned as entry point
The coding agent is the entry point, given skills and scripts to boot the stack and proceed with its task.
Tools used
Codex CLI, Symphony, Grafana, Slack, Victoria Stack, Elixir, turbo, nx
Outcome

Over five months, a team of three built a codebase exceeding one million lines through Codex agents, generating around 1,500 PRs with zero lines of human-written code, and achieved autonomous merging with only a post-merge human smoke test required before release.
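The figures above imply some rough average rates, which the following sketch works out. It is back-of-the-envelope arithmetic only: the 30-day month is an assumption (calendar days, not working days), the line count is treated as exactly one million although the source says "exceeding", and the function and key names are illustrative rather than taken from the source.

```python
# Rough throughput implied by the outcome figures quoted above.
TOTAL_LINES = 1_000_000   # "exceeding one million lines", used as a lower bound
TOTAL_PRS = 1_500         # "around 1,500 PRs"
TEAM_SIZE = 3             # "a team of three"
MONTHS = 5                # "over five months"
DAYS_PER_MONTH = 30       # assumption: calendar days, not working days


def throughput(total_lines, total_prs, team_size, months, days_per_month):
    """Return average rates implied by the totals (illustrative helper)."""
    days = months * days_per_month
    return {
        "prs_per_day": total_prs / days,
        "prs_per_person_per_day": total_prs / days / team_size,
        "lines_per_pr": total_lines / total_prs,
        "lines_per_day": total_lines / days,
    }


rates = throughput(TOTAL_LINES, TOTAL_PRS, TEAM_SIZE, MONTHS, DAYS_PER_MONTH)
for name, value in rates.items():
    print(f"{name}: {value:.1f}")
```

Under these assumptions the team averaged about 10 merged PRs per calendar day, or a little over 3 per person, at roughly 670 lines per PR. These are averages over the whole period and say nothing about how throughput changed as the models and harness improved.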

What failed first

Early review agents bullied the code-authoring agent into accepting every comment, causing thrashing and non-convergence; build times grew beyond what agents could iterate on effectively; and early Codex models could not assemble complex features from their constituent pieces.

Results
Time saved: five months
Volume: a million lines
Cost replaced: $2-3k/day
Source

https://www.latent.space/p/harness-eng

Grounding & classification
Source type: technical build writeup
35 fields verified against source quotes, 2 dropped as unverifiable.
agentic workflow, ai agent, code generation, multi agent workflow, code diff pr, knowledge base, failure mode described, metric backed, named customer, production runtime claimed, source backed, tools described, software, automation rate, cycle time reduction, employee productivity, throughput increase, technical build writeup, quality assurance, agentic task execution, ai draft human approval, autonomous resolution