quality_assurance · saas · workflow

Cloudflare builds a CI-native multi-agent AI code review system that handled 48,095 merge requests in its first month

Code review was reliably bottlenecking Cloudflare's engineering teams, with a median wait time for a first review measured in hours. Off-the-shelf AI code review tools lacked the flexibility and customisation required at Cloudflare's scale, and a naive approach of stuffing diffs into a large language model produced a flood of vague, noisy output.

How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · Merge request opens review
When an engineer at Cloudflare opens a merge request, it gets an initial pass from a coordinated set of AI agents.
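The stage above can be pictured as a coordinator that fans a diff out to several specialised reviewers and merges what they return. The following is only a minimal sketch under assumptions: the source does not describe how the agents are divided or coordinated, so the reviewer roles (`security_reviewer`, `style_reviewer`), the `Finding` type, and the `coordinate` function are all hypothetical, and the reviewers are simple string checks standing in for model calls.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    agent: str    # hypothetical reviewer that produced the finding
    path: str     # file the finding refers to
    message: str  # human-readable review comment


async def security_reviewer(diff: dict[str, str]) -> list[Finding]:
    # Stand-in for a model call: flags patches that look like hard-coded secrets.
    return [
        Finding("security", path, "possible hard-coded secret")
        for path, patch in diff.items()
        if "password=" in patch.lower()
    ]


async def style_reviewer(diff: dict[str, str]) -> list[Finding]:
    # Stand-in for a model call: flags leftover debug prints.
    return [
        Finding("style", path, "leftover debug print")
        for path, patch in diff.items()
        if "print(" in patch
    ]


async def coordinate(diff, reviewers) -> list[Finding]:
    # Run every reviewer concurrently on the same diff.
    results = await asyncio.gather(*(reviewer(diff) for reviewer in reviewers))
    # Merge the per-reviewer lists, dropping exact duplicates.
    merged = {finding for findings in results for finding in findings}
    # Sort so the combined review is stable from run to run.
    return sorted(merged, key=lambda f: (f.path, f.agent))


def review_merge_request(diff: dict[str, str]) -> list[Finding]:
    # Entry point a CI job might call when a merge request opens (assumed).
    return asyncio.run(coordinate(diff, [security_reviewer, style_reviewer]))


if __name__ == "__main__":
    example_diff = {
        "app.py": "+password='hunter2'\n+print(x)",
        "util.py": "+x = 1",
    }
    for finding in review_merge_request(example_diff):
        print(f"{finding.path} [{finding.agent}] {finding.message}")
```

Splitting the review into narrow roles and merging the results in one place is one common way to avoid the vague, redundant output that a single broad prompt tends to produce; whether Cloudflare's system is organised this way is not stated in this summary.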
Tools used
OpenCode, Claude Opus 4.7, GPT-5.4, Claude Sonnet 4.6, GPT-5.3 Codex, Kimi K2.5, GitLab (partner), Cloudflare Worker, Workers KV, Bun, Prometheus, Workers Logging, Vault
Outcome

In its first month the system completed 131,246 review runs across 48,095 merge requests in 5,169 repositories, with a median review time of 3 minutes and 39 seconds, an average cost of $1.19, an 85.7% prompt cache hit rate, and engineers needing to break glass on only 0.6% of merge requests.
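A few ratios follow directly from the reported figures and help put them in proportion. The short calculation below uses only the numbers quoted above; the variable names are illustrative, and the break-glass count is an estimate derived from the rounded 0.6% rate, not a figure given in the source.

```python
# Figures as reported for the system's first month.
review_runs = 131_246
merge_requests = 48_095
repositories = 5_169
break_glass_rate = 0.006           # 0.6% of merge requests (rounded in the source)
median_review_seconds = 3 * 60 + 39

# Derived ratios.
runs_per_merge_request = review_runs / merge_requests
merge_requests_per_repository = merge_requests / repositories
# Estimate only: the source gives the rate, not the count.
break_glass_merge_requests = merge_requests * break_glass_rate

print(f"Review runs per merge request: {runs_per_merge_request:.2f}")         # 2.73
print(f"Merge requests per repository: {merge_requests_per_repository:.2f}")  # 9.30
print(f"Estimated break-glass merge requests: {break_glass_merge_requests:.0f}")  # 289
print(f"Median review time in seconds: {median_review_seconds}")              # 219
```

On average, each merge request was reviewed between two and three times, which is consistent with reviews re-running as new commits are pushed, although the source summary does not say what triggers a re-run.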

What failed first

Commercial AI code review tools were insufficiently configurable for a large engineering organization. A naive single-prompt LLM approach of grabbing a git diff and asking a model to find bugs produced a flood of vague suggestions, hallucinated syntax errors, and redundant advice.

Results
Time saved: 131,246 (review runs completed)
Volume: 48,095 (merge requests reviewed)
Cost replaced: $1.19 (average cost)
Running since: March 10, 2026
Source

https://blog.cloudflare.com/ai-code-review/


Grounding & classification
Source type: technical build writeup
59 fields verified against source quotes, 1 dropped as unverifiable.
Tags: agentic workflow, ai agent, multi agent workflow, summarization, code diff pr, builder submitted, failure mode described, metric backed, named customer, production runtime claimed, tools described, workflow described, software, automation rate, cost reduction, cycle time reduction, employee productivity, throughput increase, technical build writeup, quality assurance, agentic task execution, escalation workflow, extract classify route