quality_assurance · saas · workflow

Multi-Agents: What's Actually Working — Cognition's Code-Review Loop and Smart Friend Patterns

Multi-agent systems with parallel writers produced fragile products: the agents made implicit, conflicting decisions about style, edge cases, and code patterns, which fragmented decision-making across the system.

How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · Coding agent writes PR
Devin, the coding agent, writes code and produces a pull request.
Tools used
Devin, Devin Review, Deepwiki, Windsurf, Claude, GPT, MCP, Sonnet 4.5
Outcome

In the code-review loop, Devin Review catches an average of 2 bugs per PR (roughly 58% of them severe), and most are resolved before a human opens the PR. The smart-friend pattern produced real gains in the trickiest scenarios when both models were frontier-class.
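As a rough illustration of the write-then-review loop described above, here is a minimal Python sketch. It is not Cognition's implementation: `review_loop`, `Finding`, `PullRequest`, the `max_rounds` cap, and both stub functions are hypothetical names invented for this example, and the real writer and reviewer would be model-backed agents rather than plain functions.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Finding:
    """One issue raised by the reviewer (hypothetical shape)."""
    description: str
    severe: bool


@dataclass
class PullRequest:
    diff: str
    # One list of findings per review round, kept for the human reviewer.
    history: List[List[Finding]] = field(default_factory=list)


def review_loop(
    write: Callable[[str, List[Finding]], str],
    review: Callable[[str], List[Finding]],
    task: str,
    max_rounds: int = 3,  # assumed cap so the loop always terminates
) -> PullRequest:
    """Alternate between a single writer and a reviewer until the
    reviewer reports nothing or the round budget runs out."""
    pr = PullRequest(diff=write(task, []))
    for _ in range(max_rounds):
        findings = review(pr.diff)
        pr.history.append(findings)
        if not findings:
            break  # clean review: ready for a human to open
        pr.diff = write(task, findings)  # writer addresses the findings
    return pr


# Stub agents standing in for the coding agent and the review agent.
def stub_write(task: str, findings: List[Finding]) -> str:
    if findings:
        return f"{task} (fixed {len(findings)} findings)"
    return f"{task} (first draft)"


def stub_review(diff: str) -> List[Finding]:
    if "first draft" in diff:
        return [
            Finding("off-by-one in pagination", severe=True),
            Finding("unused import", severe=False),
        ]
    return []


pr = review_loop(stub_write, stub_review, "add pagination")
print(pr.diff)
print(len(pr.history))
```

The point of the shape is that there is one writer making decisions and one reviewer reporting on them, so findings are resolved in sequence instead of being settled independently by parallel writers.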

What failed first

SWE-1.5 was not capable enough to serve as the primary model in the smart-friend pattern — the gap between it and Sonnet 4.5 was too wide in knowing when to escalate and what to ask the smarter model.
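The two skills named above, knowing when to escalate and knowing what to ask, can be made concrete with a small Python sketch. Everything here is an assumption for illustration: `Step`, `solve_with_smart_friend`, the `max_escalations` budget, and the stub models are hypothetical and do not reflect how Cognition wires the primary and smarter models together.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Step:
    """What the primary model returns on each turn (hypothetical shape).
    Exactly one of the two fields is expected to be set."""
    answer: Optional[str] = None
    question: Optional[str] = None  # set when the primary wants help


def solve_with_smart_friend(
    primary: Callable[[str, List[str]], Step],
    friend: Callable[[str], str],
    task: str,
    max_escalations: int = 2,  # assumed budget for calls to the smarter model
) -> Tuple[Optional[str], List[str]]:
    """The primary model owns the task; it decides when to escalate and
    what to ask. The friend only answers questions, it never writes."""
    advice: List[str] = []
    step = primary(task, advice)
    while step.question is not None and len(advice) < max_escalations:
        advice.append(friend(step.question))
        step = primary(task, advice)
    # If the budget ran out while the primary was still asking,
    # step.answer is None and the caller has to handle that case.
    return step.answer, advice


# Stub models standing in for the primary and the smarter model.
def stub_primary(task: str, advice: List[str]) -> Step:
    if not advice:
        return Step(question="Is the cache keyed per user or per session?")
    return Step(answer=f"patch using: {advice[-1]}")


def stub_friend(question: str) -> str:
    return "per user"


answer, advice = solve_with_smart_friend(stub_primary, stub_friend, "fix cache bug")
print(answer)
print(advice)
```

In this shape the quality of the result depends on the primary's `question` field: a primary that escalates too rarely, or asks vague questions, gets little from the friend, which is consistent with the failure described above.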

Results
Time saved: ~8x
Volume: 2
Source

https://cognition.ai/blog/multi-agents-working


Grounding & classification
Source type: technical build writeup
30 fields verified against source quotes.
Tags: agentic workflow, ai agent, code generation, multi agent workflow, code diff pr, failure mode described, metric backed, named customer, production runtime claimed, tools described, vendor confirmed, workflow described, software, employee productivity, error reduction, technical build writeup, quality assurance, agentic task execution, escalation workflow, human review queue