
DoorDash builds LLM guardrail system to automate restaurant menu transcription from photos

DoorDash previously relied on humans to manually transcribe restaurant menus from photos, a process described as costly and time-consuming. LLMs alone could not achieve the required high accuracy due to diverse menu structures, incomplete menus, and low-quality photos.

How it works

Common implementation structure

How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.

Stage 1 · Menu photo submitted

Restaurant partners submit menu photos to initiate the transcription workflow.

Tools used

OCR, LightGBM, ResNet, DiT, CNN

Outcome

DoorDash deployed a partially automated pipeline that pairs LLM transcription with an ML guardrail model. The guardrail routes high-confidence transcriptions to production automatically and sends low-confidence ones to human review. According to the writeup, this improved efficiency without sacrificing quality and made it possible to adopt new AI models rapidly.
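The routing step described above can be illustrated with a minimal sketch. It is not DoorDash's implementation: the `Transcription` fields, the `toy_score` function, and the 0.9 threshold are all hypothetical stand-ins. In a real system, the score would come from a trained guardrail model rather than a hand-written rule.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Transcription:
    """One LLM transcription of a menu photo (hypothetical shape)."""
    menu_id: str
    items: List[str]       # menu items extracted by the LLM
    photo_quality: float   # assumed feature in the range 0.0 to 1.0


def route(
    transcriptions: List[Transcription],
    score: Callable[[Transcription], float],
    threshold: float = 0.9,  # assumed cutoff, tuned per deployment
) -> Tuple[List[Transcription], List[Transcription]]:
    """Split transcriptions into (auto_publish, human_review) by confidence."""
    auto_publish: List[Transcription] = []
    human_review: List[Transcription] = []
    for t in transcriptions:
        if score(t) >= threshold:
            auto_publish.append(t)   # confident enough to go to production
        else:
            human_review.append(t)   # falls back to manual transcription
    return auto_publish, human_review


def toy_score(t: Transcription) -> float:
    """Stand-in for a trained guardrail model (assumption, not the real one)."""
    if not t.items:
        return 0.0                   # an empty transcription is never trusted
    return t.photo_quality


batch = [
    Transcription("a", ["Burger", "Fries"], 0.95),
    Transcription("b", ["Taco"], 0.40),
    Transcription("c", [], 0.99),
]
auto, review = route(batch, toy_score)
print([t.menu_id for t in auto], [t.menu_id for t in review])
```

Keeping the scoring function separate from the routing logic means the transcription model or the guardrail can be swapped without touching the routing code, which is one plausible reading of how such a design supports adopting new models quickly.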

What failed first

Used as standalone transcription tools, LLMs produced errors because of inconsistent menu structures, incomplete menus, and low photo quality. Even intensive efforts to improve LLM accuracy demanded too much time and investment to reach production standards.

Source

https://careersatdoordash.com/blog/doordash-llm-transcribe-menu/


Grounding & classification

Source type: technical build writeup

29 fields verified against source quotes, 3 dropped as unverifiable.

computer vision, data extraction, idp, ocr, quality inspection, summarization, product catalog, failure mode described, human review described, named customer, production runtime claimed, source backed, tools described, workflow described, ecommerce, logistics, automation rate, time saved, technical build writeup, data entry ops, ecommerce ops, document to record, human review queue