quality_assurance · workflow

Duolingo uses machine learning to prioritize learner-submitted course fix reports

About 90% of learner-submitted translation fix reports contain errors, making it difficult for contributors to efficiently identify the roughly 10% that are valid and require action.

How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack.
Stage 1 · Learner submits report
When an answer is not accepted, the learner can flag it with a 'Report'. Each report is queued in the backend for contributor review.
Tools used
logistic regression, Duolingo Incubator
Outcome

The ML ranking system outperformed the frequency-based approach on every single course, including low-resource languages, and enabled new courses such as Arabic, Latin, and Scottish Gaelic to reach stable low problem-report rates within weeks of launch.
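
As a rough illustration of how a logistic-regression score can order a review queue, here is a minimal sketch. The feature names, the weights, and the `Report`, `score`, and `prioritize` names are hypothetical and chosen only for illustration; the source does not describe Duolingo's actual features or code, and a real model would learn its weights from contributor-labelled reports.

```python
import math
from dataclasses import dataclass


@dataclass
class Report:
    answer: str
    features: dict  # hypothetical feature name -> numeric value


# Hypothetical weights for illustration only; not taken from the source.
WEIGHTS = {
    "edit_distance_to_accepted": -0.8,
    "user_proficiency": 1.5,
}
BIAS = -2.0


def score(report: Report) -> float:
    """Estimated probability that the report is valid (logistic function)."""
    z = BIAS + sum(
        WEIGHTS.get(name, 0.0) * value
        for name, value in report.features.items()
    )
    return 1.0 / (1.0 + math.exp(-z))


def prioritize(reports: list) -> list:
    """Return reports sorted so the most likely valid ones come first."""
    return sorted(reports, key=score, reverse=True)
```

Contributors would then work from the top of the list returned by `prioritize`, so the reports most likely to be valid are reviewed first.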

What failed first

The previous approach of sorting reports by how often each answer was submitted (wisdom of the crowd) performed only slightly above random, achieving an AUC of about 0.59.
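
For readers unfamiliar with the metric, AUC is the probability that a randomly chosen valid report is ranked above a randomly chosen invalid one, so 0.5 corresponds to random ordering. The sketch below computes it directly from that definition, using submission counts as the ranking score. The function name `auc` and the toy data are assumptions for illustration and are not taken from the source.

```python
def auc(scores, labels):
    """Pairwise AUC: fraction of (valid, invalid) pairs ranked correctly.

    Ties count as half a correct pair. Labels are 1 (valid) or 0 (invalid).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one valid and one invalid report")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (made up): how often each answer was submitted, and whether
# a contributor judged the report valid.
submission_counts = [12, 3, 7, 1, 9, 2]
is_valid = [0, 1, 0, 0, 1, 0]

print(f"frequency-based AUC on toy data: {auc(submission_counts, is_valid):.3f}")
```

The same function can score any ranking signal, which makes it straightforward to compare a frequency-based ordering against a model-based one on the same labelled reports.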

Results
Volume: 10%
Running since: early 2019
Source

https://blog.duolingo.com/how-machine-learning-helps-duolingo-prioritize-course-improvements/


Grounding & classification
Source type: technical build writeup
22 fields verified against source quotes.
document classification, predictive analytics, form submission, failure mode described, human review described, metric backed, named customer, production runtime claimed, tools described, workflow described, education, accuracy improvement, employee productivity, technical build writeup, quality assurance, human review queue