quality_assurance · saas · workflow

Mozilla hardens Firefox by fixing 271 latent security bugs with Claude Mythos Preview

Firefox contained latent security bugs that were notoriously difficult to find with traditional fuzzing, particularly sandbox escapes in the multiprocess browser engine that required complex reasoning to discover.

How it works

Common implementation structure

How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear under “Tools used” below.

Stage 1 · Prompt harness to find bug

The harness is prompted to find a bug in a specific part of the code and to build a test case that demonstrates it.
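The stage above can be sketched in Python. This is a minimal illustration under stated assumptions, not Mozilla's actual harness: the prompt wording, the `build_prompt` and `triage` helpers, and the `Finding` record are all hypothetical. It also assumes the generated test case is run under an AddressSanitizer build and that a finding only counts when the sanitizer actually reports an error. The only external detail relied on is the standard shape of an AddressSanitizer report header (`==PID==ERROR: AddressSanitizer: <bug class> ...`).

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical prompt template; the source does not publish the real wording.
PROMPT_TEMPLATE = (
    "Find a security bug in the code under {target}.\n"
    "If you find one, build a standalone test case that triggers it.\n"
)

# AddressSanitizer reports start with a header line such as:
#   ==4242==ERROR: AddressSanitizer: heap-use-after-free on address 0x...
ASAN_HEADER = re.compile(r"^==\d+==ERROR: AddressSanitizer: (\S+)", re.MULTILINE)


@dataclass
class Finding:
    target: str                # code area the harness was pointed at
    confirmed: bool            # True only if the sanitizer reported an error
    bug_class: Optional[str]   # e.g. "heap-use-after-free", or None


def build_prompt(target: str) -> str:
    """Fill the (hypothetical) template for one part of the code base."""
    return PROMPT_TEMPLATE.format(target=target)


def triage(target: str, sanitizer_log: str) -> Finding:
    """Treat a candidate as confirmed only if the log has an ASan error header."""
    match = ASAN_HEADER.search(sanitizer_log)
    if match is None:
        return Finding(target, False, None)
    return Finding(target, True, match.group(1))
```

Requiring a sanitizer report before a candidate counts as a finding is one plausible way to keep unreproducible claims out of the results; whether the real harness gates findings this way is an assumption here.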

Tools used

Claude Mythos Preview, Claude Opus 4.6, GPT 4, Sonnet 3.5, AddressSanitizer

Outcome

Using Claude Mythos Preview, Mozilla identified and fixed 271 previously unknown vulnerabilities in Firefox 150, including 180 sec-high and 80 sec-moderate bugs. In total, 423 security bugs were fixed in the April releases.

What failed first

Early LLM code-audit experiments used GPT 4 and Sonnet 3.5 for static analysis of high-risk code. They showed some promise but produced so many false positives that scaling was impractical. More broadly, AI-generated security reports sent to open source projects were regarded as unwanted noise.

Results

Volume: 423

Cost replaced: 271

Running since: February

Source

https://hacks.mozilla.org/2026/05/behind-the-scenes-hardening-firefox/

Grounding & classification

Source type: technical build writeup

30 fields verified against source quotes.

Tags: agentic workflow, ai agent, code generation, code diff pr, failure mode described, human review described, metric backed, named customer, production runtime claimed, tools described, workflow described, software, error reduction, throughput increase, technical build writeup, quality assurance, agentic task execution, monitor detect alert