data_entry_ops · education · workflow
Nanonets AI extracts data from 140,000+ handwritten historical documents with over 95% accuracy for Sciences Po researcher
Florian Oswald needed to extract land usage and value data from over 140,000 handwritten historical documents with non-standard, intersecting table formats. Traditional OCR tools could not handle such formats, and manual processing would have taken months.
How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · Large-scale document extraction needed
The researcher needed to extract land usage and value data from 140,000+ handwritten documents.
Tools used
Nanonets AI, Automated Workflows
Outcome
The researcher extracted data with over 95% accuracy in just 2 hours using Nanonets AI, and was able to get started within a day.
What failed first
Traditional OCR tools were unable to capture data from the non-standard table format where rows and columns intersected.
Results
Processing time: 2 hours
Accuracy: over 95%
Grounding & classification
Source type: vendor customer story
19 fields verified against source quotes.
data extraction, document ai, metric backed, named customer, tools described, vendor confirmed, workflow described, education, accuracy improvement, time saved, vendor customer story, data entry ops, document to record