MercadoLibre's Financial Data Enrichment: from handcrafted regex to LLMs and custom semantic embeddings in LATAM

MELI's transaction categorization relied on handcrafted regex rules and manually reported merchant category codes (MCCs). This approach produced frequent inconsistencies, required constant country-specific updates, and could not scale to the daily volume of new financial data across LATAM's diverse languages.

How it works

Common implementation structure

How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.

Stage 1 · Raw transaction data arrives

Raw financial transaction data enters the enrichment pipeline to be turned into structured insights.

Tools used

GPT-3.5 Turbo, GPT-4o-mini, Gemini, Claude, BERT-style, Python, Fury, DataMesh

Outcome

Adopting GPT-3.5 Turbo lifted categorization accuracy from around 60% to over 80%, cut operational costs by 75%, and raised throughput from tens of millions of transactions per quarter to tens of millions per week. Custom BERT-style embeddings then pushed accuracy to 90%, with an additional cost reduction of more than 30%, a 10x increase in scalability, and near real-time processing.
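The reported percentages compound, which is easy to misread. The short calculation below is a sketch under two assumptions that the source does not state: the "more than 30%" reduction is applied to the already-reduced post-GPT-3.5 cost, and the round figures are treated as exact lower bounds. All variable names are illustrative.

```python
# Worked arithmetic on the reported figures (assumptions noted inline).

WEEKS_PER_QUARTER = 13  # calendar fact: a quarter is about 13 weeks

# Reported cost reductions, expressed as fractions.
llm_cost_reduction = 0.75        # GPT-3.5 Turbo stage
embedding_cost_reduction = 0.30  # "more than 30%", so this is a lower bound

# ASSUMPTION: the second reduction applies to the cost left after the first.
cost_after_llm = 1.0 * (1 - llm_cost_reduction)
cost_after_embeddings = cost_after_llm * (1 - embedding_cost_reduction)
cumulative_reduction = 1 - cost_after_embeddings

# Reported accuracy at each stage, converted to error rates.
error_regex = 1 - 0.60       # around 40% of transactions miscategorized
error_llm = 1 - 0.80         # under 20%
error_embeddings = 1 - 0.90  # about 10%

# Same order of magnitude per week instead of per quarter
# implies roughly a 13x throughput increase.
throughput_multiplier = WEEKS_PER_QUARTER

print(f"Cost remaining after both stages: {cost_after_embeddings:.1%}")
print(f"Cumulative cost reduction: at least {cumulative_reduction:.1%}")
print(f"Error rate: {error_regex:.0%} -> {error_llm:.0%} -> {error_embeddings:.0%}")
print(f"Throughput increase (quarter -> week): ~{throughput_multiplier}x")
```

Under those assumptions, the two stages together leave about 17.5% of the original cost, and the error rate falls to roughly a quarter of its regex-era level.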

What failed first

MELI's first deployed categorization model, built entirely on regex and MCC rules, was limited to debit transactions in Portuguese and proved impossible to maintain as data volume grew.
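To make the maintenance problem concrete, here is a minimal sketch of a regex-plus-MCC categorizer of the kind described. It is not MELI's code: the rule table, the category names, and the `categorize` function are hypothetical, and only the two MCC values shown (5411 for grocery stores, 5812 for eating places) are standard codes.

```python
import re
from typing import Optional

# Hypothetical MCC lookup; a real table has hundreds of entries.
MCC_CATEGORIES = {
    "5411": "groceries",    # grocery stores, supermarkets
    "5812": "restaurants",  # eating places, restaurants
}

# Hypothetical Portuguese-only patterns, mirroring the first model's scope.
REGEX_RULES = [
    (re.compile(r"\bsupermercado\b", re.IGNORECASE), "groceries"),
    (re.compile(r"\b(restaurante|lanchonete)\b", re.IGNORECASE), "restaurants"),
    (re.compile(r"\bdrogaria\b", re.IGNORECASE), "pharmacy"),
]


def categorize(description: str, mcc: Optional[str] = None) -> str:
    """Return a category from the MCC if known, else from the first matching regex."""
    if mcc in MCC_CATEGORIES:
        # The reported MCC is trusted even when it contradicts the description.
        return MCC_CATEGORIES[mcc]
    for pattern, category in REGEX_RULES:
        if pattern.search(description):
            return category
    return "uncategorized"


print(categorize("COMPRA SUPERMERCADO DIA"))     # matched by a Portuguese rule
print(categorize("COMPRA EN VERDULERIA"))        # Spanish text, no rule applies
print(categorize("DROGARIA SAO PAULO", "5812"))  # wrong MCC overrides the text
```

Each new country, language, or merchant naming habit means another hand-written pattern, and a mis-reported MCC silently overrides an otherwise clear description, which is consistent with the inconsistencies and update burden the writeup reports.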

Results

Accuracy: over 80%

Cost replaced: 75%

Running since: March 2023

Source

https://medium.com/mercadolibre-tech/la-nueva-babel-financiera-ense%C3%B1ar-a-la-ia-a-hablar-dinero-en-latinoam%C3%A9rica-4605235e3aac

Grounding & classification

Source type: technical build writeup

38 fields verified against source quotes.

data extraction, document classification, bank statement, failure mode described, human review described, metric backed, named customer, production runtime claimed, tools described, workflow described, ecommerce, financial services, accuracy improvement, cost reduction, throughput increase, technical build writeup, data entry ops, finance ops, ai draft human approval, document to record, extract classify route