Evolution and Scale of Uber Eats' Multilingual Semantic Search Platform

Uber Eats' lexical search stack could not handle real-world query complexity—synonyms, typos, shorthand, multilingual terms, and context-dependent words—causing missed intent and poor results for a large portion of user searches.

How it works

Common implementation structure

How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.

Stage 1 · User types search query

A large share of orders start with people typing into the search bar to find stores, dishes, and grocery items.

Tools used

Qwen, PyTorch, Hugging Face Transformers, Ray, DeepSpeed, HNSW, feature store

Outcome

Uber Eats built a production semantic search system that powers multilingual discovery across restaurants, grocery, and retail. Three optimizations are reported: k-tuning gave a 34% latency reduction and 17% CPU savings; scalar quantization more than halved latency while keeping recall above 0.95; and MRL embeddings cut storage costs by nearly 50%.
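To make the two embedding-side optimizations named above more concrete, here is a minimal sketch of MRL-style truncation followed by int8 scalar quantization. Everything in it is an assumption for illustration: the 256 and 128 dimension counts, the symmetric quantization scheme, the helper names, and the random stand-in vector are not taken from the Uber writeup, and a random vector does not have the prefix property that an MRL-trained model provides.

```python
import math
import random

def truncate_mrl(vec, dims):
    """Keep the first `dims` components and re-normalize to unit length.
    Only meaningful for models trained with an MRL objective (assumed here)."""
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head)) or 1.0
    return [x / norm for x in head]

def quantize_int8(vec):
    """Symmetric scalar quantization: map floats to integers in [-127, 127]."""
    scale = max(abs(x) for x in vec) / 127.0 or 1.0
    return [round(x / scale) for x in vec], scale

def dequantize(codes, scale):
    """Approximate reconstruction of the original floats."""
    return [c * scale for c in codes]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

random.seed(0)
# Hypothetical 256-dimensional embedding (random stand-in, not a real model output).
full = [random.gauss(0.0, 1.0) for _ in range(256)]

short = truncate_mrl(full, 128)          # half the dimensions
codes, scale = quantize_int8(short)      # one byte per dimension
restored = dequantize(codes, scale)

# Per-vector storage under the assumed sizes (float32 = 4 bytes, int8 = 1 byte).
bytes_full_fp32 = len(full) * 4
bytes_short_fp32 = len(short) * 4
bytes_short_int8 = len(codes) * 1

print(bytes_full_fp32, bytes_short_fp32, bytes_short_int8)
print(round(cosine(short, restored), 4))
```

Under these assumed sizes, halving the dimension count halves float32 storage, and quantizing to int8 shrinks each remaining component from four bytes to one. How much recall survives either step depends on the model and index, which is why the reported figures pair the savings with a recall threshold.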

What failed first

Traditional lexical matching was effective only when queries exactly matched document text, and it returned poor results for the broad range of real-world queries Uber Eats receives.
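The failure mode can be illustrated with a toy exact-token matcher. This is a hypothetical sketch, not Uber's lexical stack: the `lexical_match` helper, the menu item, and the queries are all invented for illustration.

```python
def lexical_match(query, document):
    """Return True only if every query token appears verbatim in the document."""
    doc_tokens = set(document.lower().split())
    return all(tok in doc_tokens for tok in query.lower().split())

# Hypothetical catalog entry and user queries.
menu_item = "Grilled chicken burger with fries"
queries = [
    "chicken burger",     # exact tokens
    "chiken burger",      # typo
    "pollo burger",       # multilingual term
    "poultry sandwich",   # synonyms
]

results = {q: lexical_match(q, menu_item) for q in queries}
for q, hit in results.items():
    print(f"{q!r}: {hit}")
```

Only the first query matches. The typo, the Spanish term, and the synonym pair all describe the same dish but share too few exact tokens with the document text, which is the gap an embedding-based retriever is meant to close.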

Results

Volume: 34% (latency reduction from k-tuning)

Cost replaced: nearly 50% (storage cost reduction with MRL embeddings)

Source

https://www.uber.com/en-GB/blog/evolution-and-scale-of-ubers-delivery-search-platform/?uclick_id=0a73d271-32e7-4b77-9697-a587a4c8d9fe

Grounding & classification

Source type: technical build writeup

30 fields verified against source quotes, 1 dropped as unverifiable.

enterprise search, product catalog, failure mode described, metric backed, named customer, production runtime claimed, source backed, tools described, workflow described, ecommerce, software, accuracy improvement, cost reduction, cycle time reduction, technical build writeup, ecommerce ops, autonomous resolution, data sync enrichment