quality_assurance · saas · workflow

Cloudflare builds a multi-agent vulnerability research harness with Anthropic's Mythos Preview

Triaging security vulnerabilities at scale is hard: deciding which bugs are real and exploitable consumes analyst time, AI vulnerability scanners have made the noise problem worse, and generic coding agents lack the context and throughput to cover large codebases meaningfully.

How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases rather than tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · Recon: map attack surface
An agent reads the repository top-down, fans out to subagents per subsystem, and produces an architecture document covering build commands, trust boundaries, entry points, and likely attack surface.
Tools used
Mythos Preview
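
The fan-out in Stage 1 can be illustrated with a minimal sketch. It is not Cloudflare's implementation: the report schema, the function names (discover_subsystems, analyze_subsystem, render_architecture_doc, recon), and the rule that each top-level directory counts as a subsystem are all assumptions made for illustration. The subagent is a stub that applies filename heuristics so the example runs offline; a real harness would call a model at that point.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class SubsystemReport:
    """One subagent's findings for a single subsystem (hypothetical schema)."""
    name: str
    build_commands: list[str] = field(default_factory=list)
    trust_boundaries: list[str] = field(default_factory=list)
    entry_points: list[str] = field(default_factory=list)
    attack_surface: list[str] = field(default_factory=list)


def discover_subsystems(repo_root: Path) -> list[Path]:
    """Top-down pass: treat each non-hidden top-level directory as a subsystem."""
    return sorted(
        p for p in repo_root.iterdir() if p.is_dir() and not p.name.startswith(".")
    )


def analyze_subsystem(path: Path) -> SubsystemReport:
    """Stand-in for a subagent. A real harness would prompt a model here;
    this stub only applies filename heuristics (an assumption of the sketch)."""
    report = SubsystemReport(name=path.name)
    for f in sorted(path.rglob("*")):
        if not f.is_file():
            continue
        rel = f.relative_to(path).as_posix()
        if f.name in {"Makefile", "Cargo.toml", "package.json"}:
            report.build_commands.append(f"see {rel}")
        if f.stem in {"main", "server", "handler"}:
            report.entry_points.append(rel)
    return report


def render_architecture_doc(reports: list[SubsystemReport]) -> str:
    """Merge the per-subsystem reports into one plain-text architecture document."""
    lines = ["Architecture overview"]
    for r in reports:
        lines.append("")
        lines.append(f"Subsystem: {r.name}")
        for label, items in [
            ("Build commands", r.build_commands),
            ("Trust boundaries", r.trust_boundaries),
            ("Entry points", r.entry_points),
            ("Likely attack surface", r.attack_surface),
        ]:
            lines.append(f"  {label}: {', '.join(items) if items else '(none recorded)'}")
    return "\n".join(lines)


def recon(repo_root: Path, max_workers: int = 4) -> str:
    """Fan out one subagent per subsystem, then merge results in a stable order."""
    subsystems = discover_subsystems(repo_root)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        reports = list(pool.map(analyze_subsystem, subsystems))
    return render_architecture_doc(reports)
```

Calling recon(Path(".")) on a checkout returns the merged document as a string. Keeping the report schema fixed across subagents is one way to make the merged output predictable for later stages, though the source does not say how the real harness structures it.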
Outcome

A multi-agent harness built around Mythos Preview produces noticeably higher-quality findings with fewer hedged results, and can chain low-severity bugs into working proofs of concept, turning speculative findings into actionable ones.

What failed first

Using generic coding agents for vulnerability research produced findings but not meaningful coverage. Previous frontier models identified bugs but could not chain them into working exploits, leaving exploitability an open question. Letting the model write its own patches introduced new regressions.

Results
Running since: last year
Source

https://blog.cloudflare.com/cyber-frontier-models/


Grounding & classification
Source type: technical build writeup
21 fields verified against source quotes.
agentic workflow · code generation · multi agent workflow · code diff pr · knowledge base · failure mode described · human review described · named customer · production runtime claimed · tools described · workflow described · software · accuracy improvement · employee productivity · technical build writeup · quality assurance · agentic task execution