Leading EdTech enterprise scales ML training data annotation with Labelbox
Rapid pandemic growth created demand for hundreds of thousands of annotated text data items to train ML models, but the company's existing tools — Amazon SageMaker GroundTruth, Prodigy, and an in-house labeling tool — could not scale to meet that demand.
How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · Student screenshot responses
Students typically submitted their answers to questions in the company's software as screenshot images.
The company delivered hundreds of thousands of annotations at record speed while gaining full visibility into labeler productivity and training data quality. AI and ML also made the question-answering process smoother and helped speed up expert workflows.
What failed first
Previous annotation services including Amazon SageMaker GroundTruth and Prodigy, as well as an in-house tool, lacked transparency and offered no ability to revisit submitted labels, fix errors, or track labeler productivity — making them feel like a black box.