
DoorDash builds a centralized ML platform quadrupling model count and achieving 5x prediction throughput

DoorDash's hypergrowth called for a centralized ML platform that hides infrastructure complexity from data scientists. Its existing prediction service also could not keep pace with surging food order volumes during COVID-19.

How it works

Common implementation structure

How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear under "Tools used" below.

Stage 1 · Data scientist initiates ML workflow

Data scientists move through a highly iterative ML development process requiring ongoing experimentation across multiple steps.

Tools used

Sibyl, Redis, gRPC

Outcome

DoorDash's ML platform quadrupled the number of models and achieved 5x prediction throughput; a feature store optimization cut costs three-fold and reduced feature fetching latencies by 38%, while the platform now handles billions of predictions per day.
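To make the headline ratios concrete, the short sketch below converts them into before/after figures. The baseline values are hypothetical placeholders chosen for round numbers; they are not figures reported by DoorDash, and only the ratios (38%, three-fold, 5x) come from the writeup.

```python
def after_reduction(baseline: float, fraction: float) -> float:
    """Value remaining after cutting `baseline` by `fraction` (0.38 means 38%)."""
    return baseline * (1.0 - fraction)


def after_fold_cut(baseline: float, fold: float) -> float:
    """Value remaining after an N-fold cut, read here as baseline / N."""
    return baseline / fold


# Hypothetical baselines, for illustration only.
BASELINE_LATENCY_MS = 100.0
BASELINE_COST_UNITS = 300.0
BASELINE_PREDICTIONS_PER_SEC = 1000.0

new_latency_ms = after_reduction(BASELINE_LATENCY_MS, 0.38)   # 38% lower latency
new_cost_units = after_fold_cut(BASELINE_COST_UNITS, 3)       # three-fold cost cut
new_predictions_per_sec = BASELINE_PREDICTIONS_PER_SEC * 5    # 5x throughput

# Fractional cost saving implied by a three-fold cut.
cost_saving_fraction = 1.0 - new_cost_units / BASELINE_COST_UNITS
```

Under this reading, a three-fold cost cut leaves one third of the original spend, which is a saving of roughly 67%, while a 38% latency reduction leaves 62% of the original fetch time.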

What failed first

The manual model testing process—requiring data scientists to hand-write Python gRPC scripts per migration—was not scalable as the team grew, and early feature quality monitoring required an onboarding step that hindered adoption.
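One common way to avoid a hand-written script per migration is a single reusable harness that replays the same samples through the old and new prediction paths and reports any disagreements. The sketch below illustrates that idea only; it is not DoorDash's implementation. The names `Predictor`, `Mismatch`, `compare_predictions`, and the stand-in models are all hypothetical, and plain Python callables take the place of real gRPC clients.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

# Hypothetical type: a predictor maps a feature dict to a score.
# In practice this would wrap a gRPC client call; callables stand in here.
Predictor = Callable[[Dict[str, float]], float]


@dataclass
class Mismatch:
    index: int       # position of the sample in the replayed batch
    legacy: float    # score from the pre-migration path
    migrated: float  # score from the post-migration path


def compare_predictions(
    legacy: Predictor,
    migrated: Predictor,
    samples: Sequence[Dict[str, float]],
    tolerance: float = 1e-6,
) -> List[Mismatch]:
    """Replay identical samples through both predictors and collect disagreements."""
    mismatches: List[Mismatch] = []
    for i, features in enumerate(samples):
        old, new = legacy(features), migrated(features)
        if abs(old - new) > tolerance:
            mismatches.append(Mismatch(i, old, new))
    return mismatches


# Stand-in models for illustration only.
def legacy_model(f: Dict[str, float]) -> float:
    return 2.0 * f["distance_km"] + 5.0


def migrated_model(f: Dict[str, float]) -> float:
    return 2.0 * f["distance_km"] + 5.0


def buggy_model(f: Dict[str, float]) -> float:
    return 2.0 * f["distance_km"] + 5.5  # deliberate offset to trigger mismatches


SAMPLES = [{"distance_km": 1.0}, {"distance_km": 3.5}]
```

Calling `compare_predictions(legacy_model, migrated_model, SAMPLES)` returns an empty list when the two paths agree, while swapping in `buggy_model` returns one `Mismatch` per sample. Because the harness only depends on the callable signature, the same function can be reused for every migration instead of being rewritten each time.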

Results

Volume: 38%

Cost replaced: reduced costs three-fold

Source

https://careersatdoordash.com/blog/3-principles-for-building-an-ml-platform/


Grounding & classification

Source type: technical build writeup

25 fields verified against source quotes.

anomaly detection, predictive analytics, recommendation system, failure mode described, metric backed, named customer, production runtime claimed, tools described, workflow described, ecommerce, logistics, cost reduction, cycle time reduction, employee productivity, throughput increase, technical build writeup, back office ops, logistics ops, monitor detect alert