
Ancestry uses Labelbox to achieve weekly ML model iteration cycles for genealogical data

Ancestry's data science team needed more efficient ways to decode census data and train ML models faster. Domain experts had strong knowledge of historical documents but limited insight into model behavior, making it hard to connect their expertise to the labeling process. Getting data labeled and reviewed was extremely slow.

How it works

Common implementation structure

How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.

Stage 1 · Census data labeling need

The data science team was looking for more efficient ways to decode census data and to train their ML models faster.

Tools used

Labelbox, Boost labeling services

Outcome

With Labelbox, Ancestry achieved a weekly model iteration cycle and can now maintain training data quality, save time collaborating with domain experts, and train and test new models in record time.

What failed first

On other labeling platforms, the process was opaque — teams had to wait for all labels to come back before reviewing or giving feedback, with no ability to intervene or clarify mid-process.

Results

Time saved: record time

Source

https://labelbox.com/customers/ancestry-customer-story


Grounding & classification

Source type: vendor customer story

24 fields verified against source quotes, 1 dropped as unverifiable.

Tags: document ai, ocr, quality inspection, knowledge base, failure mode described, human review described, metric backed, named customer, production runtime claimed, tools described, workflow described, software, accuracy improvement, cycle time reduction, employee productivity, vendor customer story, data entry ops, quality assurance, ai draft human approval, human review queue