quality_assurance · workflow

Duolingo uses machine learning to prioritize learner-submitted course fix reports

About 90% of learner-submitted translation fix reports are themselves mistaken, so contributors struggle to find the roughly 10% that are valid and require action.

How it works

Common implementation structure

How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.

Stage 1 · Learner submits report

Learners submit a 'Report' flag when their answer is not accepted, queuing it in the backend for contributor review.

Tools used

logistic regression, Duolingo Incubator
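A minimal sketch of how a logistic regression could rank a report queue so that likely-valid reports surface first. The two features, the training labels, and the report IDs are hypothetical and chosen only for illustration; this summary does not list the features Duolingo actually uses.

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on log loss."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this example drives the gradient.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def score(w, b, x):
    """Estimated probability that a report is valid."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Hypothetical features per report (assumptions, not from the source):
#   [similarity to an already accepted answer, share of learners who gave this answer]
# Label 1 means a contributor accepted the report, 0 means it was rejected.
train_X = [[0.9, 0.2], [0.8, 0.1], [0.95, 0.3], [0.2, 0.3], [0.1, 0.1], [0.3, 0.2]]
train_y = [1, 1, 1, 0, 0, 0]

w, b = train_logistic(train_X, train_y)

# Unreviewed reports waiting in the queue (hypothetical IDs and values).
queue = {"report_a": [0.15, 0.25], "report_b": [0.85, 0.15], "report_c": [0.5, 0.2]}

# Contributors would see the highest-scoring reports first.
ranked = sorted(queue, key=lambda r: score(w, b, queue[r]), reverse=True)
print(ranked)
```

Sorting by predicted probability rather than applying a hard accept/reject threshold keeps a human reviewer in the loop: the model only decides the order in which contributors see reports.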

Outcome

The ML ranking system outperformed the frequency-based approach on every single course, including low-resource languages, and enabled new courses such as Arabic, Latin, and Scottish Gaelic to reach stable low problem-report rates within weeks of launch.

What failed first

The previous approach of sorting reports by how often each answer was submitted (wisdom of the crowd) performed only slightly above random, achieving an AUC of about 0.59.
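For readers unfamiliar with the metric: AUC is the probability that a randomly chosen valid report is ranked above a randomly chosen invalid one, so 0.5 corresponds to random ordering and 1.0 to a perfect ranking. The sketch below computes it for a frequency-based ordering. The counts and labels are made-up toy data and are not meant to reproduce the 0.59 figure.

```python
def auc(scores, labels):
    """Pairwise AUC: fraction of (valid, invalid) pairs ranked correctly, ties count half."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one valid and one invalid report")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Toy data (assumption): how many learners submitted each reported answer,
# and whether a contributor judged the report valid (1) or not (0).
submission_counts = [12, 3, 7, 7, 1, 9]
is_valid = [1, 0, 0, 1, 0, 0]

# Frequency-based baseline: use the raw submission count as the ranking score.
baseline_auc = auc(submission_counts, is_valid)
print(f"frequency baseline AUC: {baseline_auc:.4f}")
```

The same function can score any ranking, so a model's predicted probabilities and the frequency baseline can be compared on the same labeled reports.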

Results

Volume: 10%

Running since: early 2019

Source

https://blog.duolingo.com/how-machine-learning-helps-duolingo-prioritize-course-improvements/


Grounding & classification

Source type: technical build writeup

22 fields verified against source quotes.

document classification, predictive analytics, form submission, failure mode described, human review described, metric backed, named customer, production runtime claimed, tools described, workflow described, education, accuracy improvement, employee productivity, technical build writeup, quality assurance, human review queue