DoorDash uses LLMs and RAG to build a Product Knowledge Graph and supercharge search for New Verticals
DoorDash's expansion into new verticals (groceries, alcohol, and retail) meant handling hundreds of thousands of SKUs, each requiring accurate product attribute extraction and catalog management. Traditional human annotation for training ML models was time-consuming and expensive, and a cold-start problem made it difficult to launch model coverage quickly for new product categories.
How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · Create golden annotations
A small set of high-quality, manually labeled annotations — known as golden annotations — is created for new categories or products.
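As a rough illustration of how a small golden set might be put to work, the sketch below stores a few manually labeled products and retrieves the most similar ones as few-shot examples for an LLM annotation prompt. Everything here is an assumption made for illustration, not DoorDash's documented implementation: the `GoldenAnnotation` schema, the sample products, the token-overlap similarity (a simple stand-in for an embedding-based retriever), and the prompt wording are all hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass
class GoldenAnnotation:
    """One manually labeled product (hypothetical schema)."""
    title: str
    attributes: dict


# Hypothetical golden set; a real one would be curated by human annotators.
GOLDEN_SET = [
    GoldenAnnotation("Chardonnay White Wine 750 ml", {"category": "wine", "size": "750 ml"}),
    GoldenAnnotation("Organic Whole Milk 1 gal", {"category": "dairy", "size": "1 gal"}),
    GoldenAnnotation("Pinot Noir Red Wine 750 ml", {"category": "wine", "size": "750 ml"}),
]


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity, used here in place of embedding similarity."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def retrieve_examples(query_title: str, golden_set: list, k: int = 2) -> list:
    """Return the k golden annotations most similar to an unlabeled title."""
    ranked = sorted(golden_set, key=lambda g: jaccard(query_title, g.title), reverse=True)
    return ranked[:k]


def build_prompt(query_title: str, examples: list) -> str:
    """Assemble a few-shot prompt from retrieved golden annotations."""
    lines = ["Extract product attributes as JSON.", ""]
    for ex in examples:
        lines.append(f"Product: {ex.title}")
        lines.append(f"Attributes: {json.dumps(ex.attributes, sort_keys=True)}")
        lines.append("")
    lines.append(f"Product: {query_title}")
    lines.append("Attributes:")
    return "\n".join(lines)


examples = retrieve_examples("Merlot Red Wine 750 ml", GOLDEN_SET, k=2)
prompt = build_prompt("Merlot Red Wine 750 ml", examples)
```

The resulting `prompt` string would then be sent to whichever LLM is in use. In this sketch the golden set serves as retrieval context; it could equally be held out to spot-check the quality of LLM-generated labels.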
Tools used
Large Language Models, RAG, NLP, Ray, LoRA, QLoRA
Outcome
LLM-assisted annotation significantly reduced costs and enabled model training in days instead of weeks. The approach enhanced product discovery, enabled quicker adaptation to new verticals, and delivered a smoother, more trustworthy shopping experience while reducing human annotator workload.
What failed first
Traditional human annotation workflows were expensive and slow, unable to scale quickly to new product categories. Engagement-based training signals for search were noisy and sparse for niche tail queries, limiting search relevance model quality.