quality_assurance · finance · workflow

Coinbase builds a QA AI agent to 10x testing effort at 1/10 the cost

Coinbase's manual QA testing was slow and expensive, and its traditional end-to-end integration tests were flaky: minor layout changes caused failures that took hours to debug.

How it works

Common implementation structure

How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.

Stage 1 · Natural language test request

A natural language prompt is sufficient to initiate a test run.
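As a minimal sketch of this stage, the snippet below wraps a plain-English prompt in a small request object before it is handed to an agent. The names `TestRequest`, `build_test_request`, `platform`, and the example prompt are hypothetical illustrations. They are not taken from Coinbase's writeup or from any of the tools listed below.

```python
from dataclasses import dataclass, field
import uuid


@dataclass
class TestRequest:
    """Hypothetical envelope for one natural language test request."""
    prompt: str                      # the plain-English test description
    platform: str = "web"            # assumed target, e.g. "web" or "mobile"
    request_id: str = field(default_factory=lambda: uuid.uuid4().hex)


def build_test_request(prompt: str, platform: str = "web") -> TestRequest:
    """Normalize whitespace and reject empty prompts before queuing a run."""
    cleaned = " ".join(prompt.split())
    if not cleaned:
        raise ValueError("a test request needs a non-empty prompt")
    return TestRequest(prompt=cleaned, platform=platform)


# Hypothetical prompt: the only input a tester would need to write.
request = build_test_request(
    "Sign in with a test account and  check that the balance page loads"
)
print(request.prompt)
```

The point of the sketch is that the request carries no selectors or scripted steps. How the agent turns the prompt into browser actions is outside what this stage describes.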

Tools used

qa-ai-agent · browser-use · MongoDB · BrowserStack · gRPC · WebSocket

Outcome

The qa-ai-agent detects 300% more bugs than manual testing in the same timeframe, at 86% lower cost. New tests can be integrated in as little as 15 minutes. The agent now executes 40 test scenarios and identifies 10 issues weekly.
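To see how the two headline figures combine, the following back-of-the-envelope sketch converts them into a relative cost per bug found. It assumes that "300% more bugs" and "86% lower cost" describe the same time window, which the source does not state explicitly. The function name `relative_cost_per_bug` is illustrative only.

```python
def relative_cost_per_bug(bug_increase_pct: float, cost_reduction_pct: float) -> float:
    """Cost per bug for the agent, as a fraction of the manual cost per bug."""
    bugs_ratio = 1 + bug_increase_pct / 100    # 300% more bugs -> 4.0x as many
    cost_ratio = 1 - cost_reduction_pct / 100  # 86% lower cost -> 0.14x the spend
    return cost_ratio / bugs_ratio


ratio = relative_cost_per_bug(300, 86)
print(f"{ratio:.3f}")  # prints 0.035
```

Under that assumption, each bug found by the agent would cost about 3.5% of what it costs under manual testing. This is a derived illustration, not a figure reported in the source.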

What failed first

Traditional end-to-end integration tests were prone to flakiness, with minor layout adjustments causing failures that required hours of debugging.

Results

Time saved: 15 minutes

Volume: 75% (AI) vs. 80% (Manual)

Cost replaced: 86% reduction

Source

https://www.coinbase.com/en-nl/blog/How-We-are-Improving-Product-Quality-at-Coinbase-with-AI-agents


Grounding & classification

Source type: technical build writeup

36 fields verified against source quotes.

agentic workflow · ai agent · quality inspection · failure mode described · metric backed · named customer · production runtime claimed · tools described · workflow described · financial services · software · automation rate · cost reduction · employee productivity · throughput increase · time saved · technical build writeup · quality assurance · agentic task execution · autonomous resolution