quality_assurance · saas · workflow

Cloudflare builds CI-native multi-agent AI code review system across 48,095 merge requests

Code review was reliably bottlenecking Cloudflare's engineering teams, with a median wait time for a first review measured in hours. Off-the-shelf AI code review tools lacked the flexibility and customisation required at Cloudflare's scale, and a naive approach of stuffing diffs into a large language model produced a flood of vague, noisy output.

How it works

Common implementation structure

How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.

Stage 1 · Merge request opens review

When an engineer at Cloudflare opens a merge request, it gets an initial pass from a coordinated set of AI agents.
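
The source does not describe how the agents are coordinated, so the following is only a minimal sketch of one common pattern: a coordinator fans the diff out to several specialised reviewers concurrently, then merges and de-duplicates their findings. Every name here (`Finding`, `style_reviewer`, `secrets_reviewer`, `coordinate`, `review`) is hypothetical, and the reviewers are simple string checks standing in for model-backed agents.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    agent: str    # which reviewer produced the finding
    path: str     # file the finding refers to (placeholder in this sketch)
    line: int     # 1-based index of the line within the diff text
    message: str  # human-readable comment


# Hypothetical stand-ins for model-backed reviewers.
# A real agent would call an LLM here instead of scanning strings.
async def style_reviewer(diff: str) -> list[Finding]:
    findings = []
    for n, text in enumerate(diff.splitlines(), start=1):
        if text.startswith("+") and len(text) > 100:
            findings.append(Finding("style", "example.py", n, "line longer than 100 characters"))
    return findings


async def secrets_reviewer(diff: str) -> list[Finding]:
    findings = []
    for n, text in enumerate(diff.splitlines(), start=1):
        if text.startswith("+") and "password=" in text.replace(" ", ""):
            findings.append(Finding("secrets", "example.py", n, "possible hard-coded credential"))
    return findings


async def coordinate(diff: str) -> list[Finding]:
    """Run every reviewer concurrently, then merge and de-duplicate their findings."""
    reviewers = [style_reviewer, secrets_reviewer]
    results = await asyncio.gather(*(reviewer(diff) for reviewer in reviewers))
    merged: dict[tuple[str, int, str], Finding] = {}
    for findings in results:
        for finding in findings:
            # Keep one finding per (path, line, message) so repeated advice is reported once.
            merged.setdefault((finding.path, finding.line, finding.message), finding)
    return sorted(merged.values(), key=lambda f: (f.path, f.line))


def review(diff: str) -> list[Finding]:
    """Synchronous entry point, e.g. for a CI job step."""
    return asyncio.run(coordinate(diff))
```

Calling `review(diff_text)` from a CI step returns the merged findings. Splitting the work across narrow reviewers is one plausible way to avoid the vague, redundant output that a single catch-all prompt tends to produce.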

Tools used

OpenCode, Claude Opus 4.7, GPT-5.4, Claude Sonnet 4.6, GPT-5.3 Codex, Kimi K2.5, GitLab (partner), Cloudflare Worker, Workers KV, Bun, Prometheus, Workers Logging, Vault

Outcome

In its first month the system completed 131,246 review runs across 48,095 merge requests in 5,169 repositories, with a median review time of 3 minutes and 39 seconds, an average cost of $1.19, an 85.7% prompt cache hit rate, and engineers needing to break glass on only 0.6% of merge requests.
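
A few ratios follow directly from these figures. The sketch below only does arithmetic on the numbers quoted above. The variable and function names are illustrative, and because the 0.6% break-glass rate is a rounded figure, the merge request count derived from it is approximate.

```python
# Figures quoted in the outcome paragraph.
RUNS = 131_246
MERGE_REQUESTS = 48_095
REPOSITORIES = 5_169
BREAK_GLASS_RATE = 0.006   # 0.6%, a rounded figure
MEDIAN_REVIEW = (3, 39)    # minutes, seconds


def derived_metrics() -> dict[str, float]:
    """Compute simple ratios from the published first-month figures."""
    minutes, seconds = MEDIAN_REVIEW
    return {
        "runs_per_merge_request": RUNS / MERGE_REQUESTS,
        "merge_requests_per_repository": MERGE_REQUESTS / REPOSITORIES,
        # Approximate, because the published rate is rounded to one decimal place.
        "break_glass_merge_requests": round(MERGE_REQUESTS * BREAK_GLASS_RATE),
        "median_review_seconds": minutes * 60 + seconds,
    }


if __name__ == "__main__":
    for name, value in derived_metrics().items():
        print(f"{name}: {value:.2f}")
```

In other words, each merge request was reviewed roughly 2.7 times on average, each repository saw about 9 merge requests, and on the order of 289 merge requests needed a break-glass override.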

What failed first

Commercial AI code review tools were insufficiently configurable for a large engineering organization. A naive single-prompt LLM approach of grabbing a git diff and asking a model to find bugs produced a flood of vague suggestions, hallucinated syntax errors, and redundant advice.

Results

Review runs completed: 131,246

Merge requests reviewed: 48,095

Average cost: $1.19

Running since: March 10, 2026

Source

https://blog.cloudflare.com/ai-code-review/

Grounding & classification

Source type: technical build writeup

59 fields verified against source quotes, 1 dropped as unverifiable.

Tags: agentic workflow, ai agent, multi agent workflow, summarization, code diff pr, builder submitted, failure mode described, metric backed, named customer, production runtime claimed, tools described, workflow described, software, automation rate, cost reduction, cycle time reduction, employee productivity, throughput increase, technical build writeup, quality assurance, agentic task execution, escalation workflow, extract classify route