
Yelp scales LLM-based search query understanding to millions of daily searches in production

Yelp's pre-LLM query understanding was fragmented across several different systems stitched together, and it often lacked the intelligence to handle nuanced user intent, spelling errors, ambiguous locations, and complex semantic phrase expansions.

How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · User search query submitted
A user enters a search query and the system must understand their intent.
Tools used
LLMs, GPT-4, GPT3, RAG, o1-mini, o1-preview, GPT4o-mini, BERT, T5
Outcome

Yelp successfully deployed LLM-based query understanding to production, scaling review highlights to 95% of traffic through pre-computed results, achieving up to 100x cost savings compared to direct GPT-4 usage, and increasing Session / Search CTR across platforms.
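The pre-computation idea in the outcome above can be illustrated with a minimal Python sketch. The source states only that pre-computed results were used to reach 95% of traffic; everything else here is an assumption made for illustration, including the in-memory lookup table, the `normalize` helper, the `PrecomputedQueryUnderstanding` class, and the `cheap_fallback` function standing in for whatever handles queries that were not pre-computed.

```python
from typing import Callable, Dict


def normalize(query: str) -> str:
    """Collapse case and whitespace so trivially different queries share one key."""
    return " ".join(query.lower().split())


class PrecomputedQueryUnderstanding:
    """Hypothetical lookup layer: serve offline LLM output, else call a fallback."""

    def __init__(self, precomputed: Dict[str, dict], fallback: Callable[[str], dict]):
        # Assumption: results were generated offline by an expensive LLM
        # and are keyed by the normalized query text.
        self._table = {normalize(q): r for q, r in precomputed.items()}
        self._fallback = fallback
        self.hits = 0
        self.misses = 0

    def understand(self, query: str) -> dict:
        key = normalize(query)
        if key in self._table:
            self.hits += 1
            return self._table[key]
        # Query was not pre-computed: hand it to the fallback path.
        self.misses += 1
        return self._fallback(key)

    def hit_rate(self) -> float:
        """Share of requests answered from pre-computed results."""
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def cheap_fallback(query: str) -> dict:
    # Placeholder only; a real system would call some real-time model here.
    return {"query": query, "source": "fallback"}


service = PrecomputedQueryUnderstanding(
    {"Pizza near me": {"topic": "pizza", "location": "near me", "source": "precomputed"}},
    cheap_fallback,
)
```

Calling `service.understand("  pizza   NEAR me ")` is served from the table because normalization maps it to the same key, while an unseen query such as `"vegan ramen"` goes to the fallback. Tracking `hit_rate()` is one way to measure how much traffic the pre-computed results actually cover.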

What failed first

Traditional Named Entity Recognition and text similarity models could not handle the nuances of Yelp's query understanding needs, including multi-concept queries, ambiguous locations, and creative semantic expansions.

Results
Volume: 95%
Cost replaced: up to a 100x savings in cost
Source

https://engineeringblog.yelp.com/2025/02/search-query-understanding-with-LLMs.html


Grounding & classification
Source type: technical build writeup
33 fields verified against source quotes.
content generation, data extraction, rag, knowledge base, human review described, metric backed, named customer, production runtime claimed, source backed, tools described, workflow described, software, accuracy improvement, conversion increase, cost reduction, throughput increase, technical build writeup, back office ops, extract classify route, rag answering