
DoorDash uses LLMs and RAG to build a Product Knowledge Graph and supercharge search for New Verticals

DoorDash's expansion into new verticals — groceries, alcohol, and retail — created the challenge of handling hundreds of thousands of SKUs requiring accurate product attribute extraction and catalog management. Traditional human annotation for training ML models was time-consuming and expensive, and a cold start problem made it difficult to quickly launch model coverage for new product categories.

How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · Create golden annotations
A small set of high-quality, manually labeled annotations — known as golden annotations — is created for new categories or products.
Tools used
Large Language Models, RAG, NLP, Ray, LoRA, QLoRA
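
As a rough illustration of how a small golden set might feed LLM-assisted annotation, the sketch below retrieves the golden annotations most similar to an unlabeled product and places them in a few-shot prompt. This is an assumption about one plausible use of the golden set, not DoorDash's documented implementation: the `GoldenAnnotation` schema, the sample products, the token-overlap (Jaccard) similarity, and the prompt wording are all hypothetical, and a production system would more likely use embedding-based retrieval. No LLM is called here; the code only builds the prompt text.

```python
from dataclasses import dataclass


@dataclass
class GoldenAnnotation:
    """One manually labeled product (hypothetical schema)."""
    product_name: str
    attributes: dict[str, str]


def _tokens(text: str) -> set[str]:
    """Lowercased whitespace tokens of a product name."""
    return set(text.lower().split())


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity in [0, 1]; a stand-in for embedding similarity."""
    ta, tb = _tokens(a), _tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def retrieve_similar(query: str, golden: list[GoldenAnnotation], k: int = 2) -> list[GoldenAnnotation]:
    """Return the k golden annotations whose names best match the query."""
    ranked = sorted(golden, key=lambda g: jaccard(query, g.product_name), reverse=True)
    return ranked[:k]


def build_prompt(query: str, golden: list[GoldenAnnotation], k: int = 2) -> str:
    """Assemble a few-shot prompt from the retrieved golden examples."""
    lines = ["Extract product attributes as key=value pairs.", ""]
    for example in retrieve_similar(query, golden, k):
        attrs = ", ".join(f"{key}={value}" for key, value in sorted(example.attributes.items()))
        lines.append(f"Product: {example.product_name}")
        lines.append(f"Attributes: {attrs}")
        lines.append("")
    # The unlabeled product goes last, leaving the attributes for the model to fill in.
    lines.append(f"Product: {query}")
    lines.append("Attributes:")
    return "\n".join(lines)


# Hypothetical golden set; real golden annotations come from human labelers.
GOLDEN = [
    GoldenAnnotation("Cabernet Sauvignon Red Wine 750 ml", {"category": "wine", "size": "750 ml"}),
    GoldenAnnotation("Organic Whole Milk 1 gal", {"category": "dairy", "size": "1 gal"}),
    GoldenAnnotation("Pinot Noir Red Wine 750 ml", {"category": "wine", "size": "750 ml"}),
]

if __name__ == "__main__":
    print(build_prompt("Merlot Red Wine 750 ml", GOLDEN))
```

Retrieving only the nearest golden examples keeps the prompt short and lets a small labeled set cover a new category, which is the appeal of starting from golden annotations when little labeled data exists.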
Outcome

LLM-assisted annotation significantly reduced costs and enabled model training in days instead of weeks. The approach enhanced product discovery, enabled quicker adaptation to new verticals, and delivered a smoother, more trustworthy shopping experience while reducing human annotator workload.

What failed first

Traditional human annotation workflows were expensive and slow, unable to scale quickly to new product categories. Engagement-based training signals for search were noisy and sparse for niche tail queries, limiting search relevance model quality.

Results
Time saved: train models in days instead of weeks
Cost replaced: significantly reducing costs
Source

https://careersatdoordash.com/blog/unleashing-the-power-of-large-language-models-at-doordash-for-a-seamless-shopping-adventure/


Grounding & classification
Source type: technical build writeup
32 fields verified against source quotes.
data extraction, document classification, knowledge search, personalization, rag, knowledge base, product catalog, metric backed, named customer, production runtime claimed, tools described, workflow described, ecommerce, logistics, accuracy improvement, cost reduction, employee productivity, time saved, technical build writeup, data entry ops, ecommerce ops, quality assurance, data sync enrichment, extract classify route