
Leading EdTech enterprise scales ML training data annotation with Labelbox

Rapid pandemic growth created demand for hundreds of thousands of annotated text data items to train ML models, but the company's existing tools — Amazon SageMaker GroundTruth, Prodigy, and an in-house labeling tool — could not scale to meet that demand.

How it works

Common implementation structure

How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.

Stage 1 · Student screenshot responses

Students typically submitted their answers to questions in the software as screenshot images.

Tools used

Labelbox · Workforce · OCR · Amazon SageMaker GroundTruth · Prodigy

Outcome

The company delivered hundreds of thousands of annotations at record speed while gaining full visibility into labeler productivity and training data quality. AI and ML also made the question-answering process smoother and helped speed up expert workflows.

What failed first

Previous annotation services including Amazon SageMaker GroundTruth and Prodigy, as well as an in-house tool, lacked transparency and offered no ability to revisit submitted labels, fix errors, or track labeler productivity — making them feel like a black box.

Results

Volume: hundreds of thousands of annotations

Source

https://labelbox.com/customers/ner-edtech/


Grounding & classification

Source type: vendor customer story

28 fields verified against source quotes.

ocr · quality inspection · recommendation system · form submission · knowledge base · failure mode described · human review described · metric backed · production runtime claimed · tools described · vendor confirmed · workflow described · education · employee productivity · throughput increase · time saved · vendor customer story · data entry ops · quality assurance · document to record · human review queue