data_entry_ops · education · workflow

Nanonets AI extracts data from 140,000+ handwritten historical documents with 95% accuracy for Sciences Po researcher

Florian Oswald needed to extract land usage and value data from over 140,000 handwritten historical documents with non-standard, intersecting table formats. Traditional OCR tools could not handle such formats, and manual processing would have taken months.

How it works

Common implementation structure

How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.

Stage 1 · Large-scale document extraction needed

The researcher needed to extract land usage and value data from 140,000+ handwritten documents.

Tools used

Nanonets AI · Automated Workflows

Outcome

The researcher extracted data with over 95% accuracy in just 2 hours using Nanonets AI, and was able to get started within a day.

What failed first

Traditional OCR tools were unable to capture data from the non-standard table format where rows and columns intersected.

Results

Processing time: 2 hours

Accuracy: over 95%
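
The source does not say how the 95% figure was measured. One common way to check a figure like this is to transcribe a small sample of documents by hand and compare each extracted field against it. The sketch below assumes that approach; the field names (`land_use`, `value`) and the sample records are hypothetical and do not come from the case study.

```python
from typing import Mapping, Sequence


def _norm(value: str) -> str:
    # Collapse whitespace and ignore case so trivial differences are not counted as errors.
    return " ".join(value.split()).casefold()


def field_accuracy(
    extracted: Sequence[Mapping[str, str]],
    reference: Sequence[Mapping[str, str]],
) -> float:
    """Return the share of reference fields whose extracted value matches."""
    if len(extracted) != len(reference):
        raise ValueError("extracted and reference must cover the same documents")
    total = 0
    correct = 0
    for got, want in zip(extracted, reference):
        for field, expected in want.items():
            total += 1
            # A field missing from the extraction counts as an error.
            if _norm(got.get(field, "")) == _norm(expected):
                correct += 1
    if total == 0:
        raise ValueError("reference contains no fields")
    return correct / total


# Hypothetical hand-transcribed sample (two documents, two fields each).
reference_sample = [
    {"land_use": "Vineyard", "value": "1200"},
    {"land_use": "Pasture", "value": "450"},
]
# Hypothetical extraction output for the same two documents; one value is misread.
extracted_sample = [
    {"land_use": "vineyard", "value": "1200"},
    {"land_use": "Pasture", "value": "480"},
]

sample_accuracy = field_accuracy(extracted_sample, reference_sample)
print(f"Field-level accuracy on sample: {sample_accuracy:.0%}")
```

Exact matching after normalization is a strict criterion; a project working with handwritten numerals might instead accept values within a tolerance, which would raise the measured figure.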

Source

https://nanonets.com/customer-success-story/extracting-handwritten-documents-using-nanonets


Grounding & classification

Source type: vendor customer story

19 fields verified against source quotes.

data extraction · document ai · metric backed · named customer · tools described · vendor confirmed · workflow described · education · accuracy improvement · time saved · vendor customer story · data entry ops · document to record