
How Uber's Michelangelo ML platform evolved from predictive models to generative AI at scale

Before Michelangelo, Uber's ML development was fragmented: applied scientists worked in Jupyter Notebooks while engineers built bespoke deployment pipelines. There was no system for reliable, reproducible workflows, no easy way to compare training experiments, and no established path to production deployment.

How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools used” below.
Stage 1 · Unified ML project initiation
MA Studio provides a simplified user flow covering every step of the ML journey from feature/data prep through production performance monitoring.
Tools used
Michelangelo, Palette, Horovod, Ray, TensorFlow, PyTorch, XGBoost, Spark, Triton, Kubernetes, Canvas, PyML, Gallery, Manifold, Neuropod, Docker, Bazel, Jupyter Notebooks, etcd, Peloton, Mesos
Outcome

Michelangelo now manages approximately 400 active ML projects and runs over 20K model training jobs per month. More than 5K models are in production, serving 10 million real-time predictions per second at peak. Deep learning adoption in tier-1 projects has increased from almost zero to more than 60%.

What failed first

Michelangelo 1.0 had four structural gaps: no comprehensive ML quality definition or project tiering, insufficient deep learning support, inadequate collaborative model development capabilities, and fragmented ML tooling that forced developers to switch constantly between semi-isolated systems.

Results
Time saved: over 20K (counted as model training jobs per month)
Volume: approximately 400 active ML projects
Running since: early 2016
Source

https://www.uber.com/us/en/blog/from-predictive-to-generative-ai/


Grounding & classification
Source type: technical build writeup
56 fields verified against source quotes, 2 dropped as unverifiable.
anomaly detection, computer vision, conversational AI, forecasting, predictive analytics, recommendation system, builder submitted, metric backed, named customer, production runtime claimed, tools described, workflow described, logistics, travel, automation rate, employee productivity, throughput increase, technical build writeup, back office ops, customer support, data sync enrichment, monitor detect alert