
DoorDash uses LLMs and RAG to build a Product Knowledge Graph and supercharge search for New Verticals

DoorDash's expansion into new verticals — groceries, alcohol, and retail — created the challenge of handling hundreds of thousands of SKUs requiring accurate product attribute extraction and catalog management. Traditional human annotation for training ML models was time-consuming and expensive, and a cold start problem made it difficult to quickly launch model coverage for new product categories.

How it works

Common implementation structure

How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack.

Stage 1 · Create golden annotations

A small set of high-quality, manually labeled annotations — known as golden annotations — is created for new categories or products.
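As a rough illustration of how a small golden set might be put to work, the sketch below stores a few golden annotations and retrieves the most similar ones to build a few-shot prompt for an LLM annotator. This is an assumption about usage rather than DoorDash's documented pipeline: the `GoldenAnnotation` schema, the word-overlap (Jaccard) retrieval, the prompt wording, and the sample products are all hypothetical, and a production system would more likely use embedding-based retrieval.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GoldenAnnotation:
    """One manually labeled product (hypothetical schema)."""
    product_name: str
    attributes: dict


def tokenize(text: str) -> set:
    """Lowercase whitespace tokenization; a stand-in for real embeddings."""
    return set(text.lower().split())


def jaccard(a: set, b: set) -> float:
    """Word-overlap similarity in [0, 1]."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def retrieve_similar(query: str, golden: list, k: int = 2) -> list:
    """Return the k golden annotations whose names best match the query."""
    q = tokenize(query)
    ranked = sorted(
        golden,
        key=lambda g: jaccard(q, tokenize(g.product_name)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, golden: list, k: int = 2) -> str:
    """Assemble a few-shot prompt from the retrieved golden examples."""
    lines = ["Extract product attributes as key=value pairs.", ""]
    for example in retrieve_similar(query, golden, k):
        attrs = ", ".join(
            f"{name}={value}"
            for name, value in sorted(example.attributes.items())
        )
        lines.append(f"Product: {example.product_name}")
        lines.append(f"Attributes: {attrs}")
        lines.append("")
    lines.append(f"Product: {query}")
    lines.append("Attributes:")
    return "\n".join(lines)


# Hypothetical golden set; real ones would be curated by human annotators.
GOLDEN = [
    GoldenAnnotation("Acme Cabernet Sauvignon 750 ml",
                     {"category": "wine", "size": "750 ml"}),
    GoldenAnnotation("Acme Pale Ale 6 pack 12 oz cans",
                     {"category": "beer", "size": "12 oz", "pack": "6"}),
    GoldenAnnotation("Acme Whole Milk 1 gallon",
                     {"category": "dairy", "size": "1 gallon"}),
]

if __name__ == "__main__":
    print(build_prompt("Acme Merlot 750 ml", GOLDEN, k=1))
```

The resulting prompt text would then be sent to whichever LLM produces the draft annotations. Keeping the golden set small and retrieving only the closest examples is one way to limit manual labeling effort when a new category launches.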

Tools used

Large Language Models, RAG, NLP, Ray, LoRA, QLoRA

Outcome

LLM-assisted annotation significantly reduced costs and enabled model training in days instead of weeks. The approach enhanced product discovery, enabled quicker adaptation to new verticals, and delivered a smoother, more trustworthy shopping experience while reducing human annotator workload.

What failed first

Traditional human annotation workflows were expensive and slow, unable to scale quickly to new product categories. Engagement-based training signals for search were noisy and sparse for niche tail queries, limiting search relevance model quality.

Results

Time saved: train models in days instead of weeks

Cost replaced: significantly reducing costs

Source

https://careersatdoordash.com/blog/unleashing-the-power-of-large-language-models-at-doordash-for-a-seamless-shopping-adventure/


Grounding & classification

Source type: technical build writeup

32 fields verified against source quotes.

Tags: data extraction, document classification, knowledge search, personalization, rag, knowledge base, product catalog, metric backed, named customer, production runtime claimed, tools described, workflow described, ecommerce, logistics, accuracy improvement, cost reduction, employee productivity, time saved, technical build writeup, data entry ops, ecommerce ops, quality assurance, data sync enrichment, extract classify route