quality_assurance · saas · workflow
Atlassian ML comment ranker reduces PR cycle time by 30% for Rovo Dev code reviewer agent
Unfiltered LLM-generated code review comments were often noisy, nit-picky, or factually wrong, which drew negative user feedback. Heuristic-based filters improved precision but sacrificed recall, and they were too rigid to adapt to foundation model changes or product rollouts.
How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · PR creation triggers agent
The code reviewer agent is triggered when a pull request is created.
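As a rough illustration of this trigger stage, the sketch below turns a PR-creation webhook into a review job. It assumes a Bitbucket Cloud style webhook whose event key is `pullrequest:created` and whose payload carries `pullrequest` and `repository` objects. `ReviewJob` and `job_from_webhook` are hypothetical names introduced here, not part of the documented system.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed event key for a newly created pull request (Bitbucket Cloud style).
PR_CREATED_EVENT = "pullrequest:created"


@dataclass
class ReviewJob:
    """Hypothetical unit of work handed to the code reviewer agent."""
    repo: str
    pr_id: int


def job_from_webhook(event_key: str, payload: dict) -> Optional[ReviewJob]:
    """Return a review job for PR-creation events, otherwise None."""
    if event_key != PR_CREATED_EVENT:
        return None  # ignore updates, merges, comments, and other events
    pr = payload.get("pullrequest") or {}
    repo = payload.get("repository") or {}
    if "id" not in pr or "full_name" not in repo:
        return None  # malformed payload: nothing to review
    return ReviewJob(repo=repo["full_name"], pr_id=int(pr["id"]))
```

A caller would pass the webhook's event key and decoded JSON body, then enqueue the returned job if it is not `None`.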
Tools used
Rovo Dev, ModernBERT, HuggingFace, GPT-4o, Claude Sonnet 3.5, Sonnet 4, Bitbucket Cloud, Jira
Outcome
The comment ranker drove CRR to 40–45% (near the human benchmark of ~45%), reduced PR cycle time by 30%, and remained stable when the generation model switched from GPT-4o to Claude Sonnet 3.5.
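The source does not describe the ranker's internals, so the following is only a minimal sketch of how a score-based comment filter might sit between generation and posting. The scoring model is stubbed out as a caller-supplied `score_fn`; the `Comment` type, the `threshold` and `max_comments` parameters, and their default values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Comment:
    """Hypothetical LLM-generated review comment."""
    text: str
    file_path: str


def rank_and_filter(
    comments: List[Comment],
    score_fn: Callable[[Comment], float],
    threshold: float = 0.5,   # assumed cut-off, not from the source
    max_comments: int = 5,    # assumed cap per pull request
) -> List[Comment]:
    """Keep comments scoring at or above the threshold, best first."""
    scored = [(score_fn(c), c) for c in comments]
    kept = [(s, c) for s, c in scored if s >= threshold]
    # Stable sort: comments with equal scores keep their original order.
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in kept[:max_comments]]
```

Because the filter depends only on the scores and not on which model wrote the comments, a design like this can stay in place when the upstream generation model changes, which matches the stability the outcome describes.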
What failed first
Heuristic-based filters, including LLM-based comment categorization, did not take full advantage of ML and transformer architectures and could not adapt to upstream LLM changes or new code patterns. They had to be replaced by a more holistic, scientific approach.
Results
Time saved: 30%
Volume: 40–45%
Grounding & classification
Source type: technical build writeup
35 fields verified against source quotes, 2 dropped as unverifiable.
Tags: ai agent, content generation, document classification, predictive analytics, quality inspection, code diff pr, builder submitted, metric backed, named customer, production runtime claimed, tools described, workflow described, software, accuracy improvement, cycle time reduction, employee productivity, technical build writeup, quality assurance, agentic task execution, extract classify route