Multi-Agents: What's Actually Working — Cognition's Code-Review Loop and Smart Friend Patterns

Multi-agent systems with parallel writers produced fragile results: each agent made implicit, conflicting decisions about style, edge cases, and code patterns, which fragmented decision-making across the system.

How it works

Common implementation structure

How this type of workflow is typically built, generalized across documented cases rather than tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.

Stage 1 · Coding agent writes PR

Devin, the coding agent, writes code and produces a pull request.

Tools used

Devin, Devin Review, Deepwiki, Windsurf, Claude, GPT, MCP, Sonnet 4.5

Outcome

In the code-review loop, Devin Review catches an average of 2 bugs per PR (roughly 58% of them severe), and most are resolved before a human opens the PR. The smart-friend pattern produced real gains in the trickiest scenarios when both models were frontier-class.
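The review loop described above can be sketched as a writer and a reviewer alternating until the reviewer has nothing left to report. This is a minimal illustration, not Cognition's implementation: `review_loop`, `Finding`, `PullRequest`, `toy_review`, and `toy_fix` are hypothetical names, and the two toy functions stand in for real calls to a coding agent and a review agent.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Finding:
    description: str
    severe: bool


@dataclass
class PullRequest:
    diff: str
    history: List[str] = field(default_factory=list)


def review_loop(
    pr: PullRequest,
    review: Callable[[str], List[Finding]],
    fix: Callable[[str, List[Finding]], str],
    max_rounds: int = 3,
) -> PullRequest:
    """Alternate review and fix until the reviewer is silent or rounds run out."""
    for round_number in range(1, max_rounds + 1):
        findings = review(pr.diff)
        if not findings:
            pr.history.append(f"round {round_number}: clean")
            break
        pr.history.append(f"round {round_number}: {len(findings)} finding(s)")
        pr.diff = fix(pr.diff, findings)
    return pr


# Hypothetical stand-ins for the review agent and the coding agent.
def toy_review(diff: str) -> List[Finding]:
    findings = []
    if "TODO" in diff:
        findings.append(Finding("unfinished TODO left in diff", severe=False))
    if "except: pass" in diff:
        findings.append(Finding("exception silently swallowed", severe=True))
    return findings


def toy_fix(diff: str, findings: List[Finding]) -> str:
    return diff.replace("TODO", "").replace("except: pass", "except Exception: raise")


draft = PullRequest("x = 1  # TODO\ntry:\n    f()\nexcept: pass")
reviewed = review_loop(draft, toy_review, toy_fix)
print(reviewed.history)
```

The `max_rounds` cap is a design assumption of this sketch: it keeps the two agents from looping indefinitely, and whatever remains unresolved at that point goes to the human reviewer.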

What failed first

SWE-1.5 was not capable enough to serve as the primary model in the smart-friend pattern — the gap between it and Sonnet 4.5 was too wide in knowing when to escalate and what to ask the smarter model.
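The failure above comes down to two decisions the primary model has to make itself: when to escalate and what to ask. The sketch below makes both explicit. It is an illustration under assumptions, not the source's implementation: `Attempt`, `solve_with_smart_friend`, `toy_primary`, and `toy_friend` are hypothetical names, and the toy functions stand in for calls to a primary model and a stronger "smart friend" model.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Attempt:
    answer: Optional[str] = None    # set when the primary model is confident
    question: Optional[str] = None  # set when it wants to consult the smart friend


def solve_with_smart_friend(
    task: str,
    primary: Callable[[str, Optional[str]], Attempt],
    friend: Callable[[str], str],
    max_escalations: int = 2,
) -> Tuple[str, int]:
    """Return (answer, number of escalations made along the way)."""
    advice: Optional[str] = None
    escalations = 0
    while True:
        attempt = primary(task, advice)
        if attempt.answer is not None:
            return attempt.answer, escalations
        if attempt.question is None or escalations >= max_escalations:
            raise RuntimeError("primary model could not finish the task")
        # The primary model chose both the moment to escalate and the question.
        advice = friend(attempt.question)
        escalations += 1


# Hypothetical stand-ins for the two models.
def toy_primary(task: str, advice: Optional[str]) -> Attempt:
    if "race condition" in task and advice is None:
        return Attempt(question=f"How should I approach: {task}?")
    suffix = f" (using advice: {advice})" if advice else ""
    return Attempt(answer=f"done: {task}{suffix}")


def toy_friend(question: str) -> str:
    return "take the lock before reading"


print(solve_with_smart_friend("rename variable", toy_primary, toy_friend))
print(solve_with_smart_friend("fix race condition", toy_primary, toy_friend))
```

In this framing, a primary model that is too weak fails inside `primary` itself: it either answers when it should have asked, or asks a question the friend cannot usefully answer, and no amount of capability in `friend` compensates for that.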

Results

Time saved: ~8x

Volume: 2

Source

https://cognition.ai/blog/multi-agents-working

Grounding & classification

Source type: technical build writeup

30 fields verified against source quotes.

Tags: agentic workflow, ai agent, code generation, multi agent workflow, code diff pr, failure mode described, metric backed, named customer, production runtime claimed, tools described, vendor confirmed, workflow described, software, employee productivity, error reduction, technical build writeup, quality assurance, agentic task execution, escalation workflow, human review queue