logistics_ops · workflow

DoorDash builds a centralized ML platform quadrupling model count and achieving 5x prediction throughput

DoorDash's rapid hypergrowth required a centralized ML platform to abstract infrastructure complexity for data scientists, and the existing prediction service could not keep pace with surging food order volumes during COVID-19.

How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · Data scientist initiates ML workflow
Data scientists move through a highly iterative ML development process requiring ongoing experimentation across multiple steps.
Tools used
Sibyl, Redis, gRPC
Outcome

DoorDash's ML platform quadrupled the number of models and achieved 5x prediction throughput; a feature store optimization cut costs three-fold and reduced feature fetching latencies by 38%, while the platform now handles billions of predictions per day.

What failed first

The manual model testing process—requiring data scientists to hand-write Python gRPC scripts per migration—was not scalable as the team grew, and early feature quality monitoring required an onboarding step that hindered adoption.
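One way to avoid a hand-written script per migration is a single reusable harness that replays the same requests against the old and new model and reports any disagreement. The sketch below is illustrative only and is not DoorDash's tooling: `compare_predictors`, the `Predictor` type, and the toy models are hypothetical names, and plain Python callables stand in for the gRPC calls to the prediction service.

```python
import math
from typing import Callable, Iterable, List, Mapping, Tuple

# Hypothetical types: a request is a mapping of feature name to value,
# and a predictor is any callable that turns a request into a score.
Features = Mapping[str, float]
Predictor = Callable[[Features], float]


def compare_predictors(
    old_predict: Predictor,
    new_predict: Predictor,
    requests: Iterable[Features],
    rel_tol: float = 1e-6,
    abs_tol: float = 1e-9,
) -> List[Tuple[int, float, float]]:
    """Replay each request against both predictors and collect mismatches.

    Returns a list of (request_index, old_score, new_score) for every
    request where the two scores differ beyond the given tolerances.
    """
    mismatches = []
    for index, features in enumerate(requests):
        expected = old_predict(features)
        actual = new_predict(features)
        if not math.isclose(expected, actual, rel_tol=rel_tol, abs_tol=abs_tol):
            mismatches.append((index, expected, actual))
    return mismatches


if __name__ == "__main__":
    # Toy stand-ins for a legacy model and its migrated counterpart.
    def legacy_model(f: Features) -> float:
        return 2.0 * f["distance_km"] + 5.0

    def migrated_model(f: Features) -> float:
        return 5.0 + f["distance_km"] * 2.0

    sample_requests = [{"distance_km": 1.5}, {"distance_km": 4.0}]
    print(compare_predictors(legacy_model, migrated_model, sample_requests))
```

In a real migration the two callables would wrap client calls to the old and new serving endpoints, so the comparison logic is written once and only the request set changes from model to model.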

Results
Feature fetching latency: reduced by 38%
Cost: reduced three-fold
Source

https://careersatdoordash.com/blog/3-principles-for-building-an-ml-platform/


Grounding & classification
Source type: technical build writeup
25 fields verified against source quotes.
Tags: anomaly detection, predictive analytics, recommendation system, failure mode described, metric backed, named customer, production runtime claimed, tools described, workflow described, ecommerce, logistics, cost reduction, cycle time reduction, employee productivity, throughput increase, technical build writeup, back office ops, logistics ops, monitor detect alert