How Uber's Michelangelo ML platform evolved from predictive models to generative AI at scale

Before Michelangelo, Uber's ML development was fragmented: applied scientists worked in Jupyter Notebooks while engineers built bespoke deployment pipelines. There was no system for reliable, reproducible workflows, no easy way to compare training experiments, and no established path to production deployment.

How it works

Common implementation structure

How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear under “Tools used” below.

Stage 1 · Unified ML project initiation

MA Studio provides a simplified user flow covering every step of the ML journey from feature/data prep through production performance monitoring.

Tools used

Michelangelo, Palette, Horovod, Ray, TensorFlow, PyTorch, XGBoost, Spark, Triton, Kubernetes, Canvas, PyML, Gallery, Manifold, Neuropod, Docker, Bazel, Jupyter Notebooks, etcd, Peloton, Mesos

Outcome

Michelangelo now manages approximately 400 active ML projects and runs over 20K model training jobs per month. More than 5K models are in production, serving 10 million real-time predictions per second at peak, and deep learning adoption in tier-1 projects has increased from almost zero to more than 60%.

What failed first

Michelangelo 1.0 had four structural gaps: no comprehensive ML quality definition or project tiering, insufficient deep learning support, inadequate collaborative model development capabilities, and fragmented ML tooling that forced developers to switch constantly between semi-isolated systems.

Results

Training volume: over 20K model training jobs per month

Volume: approximately 400 active ML projects

Running since: early 2016

Source

https://www.uber.com/us/en/blog/from-predictive-to-generative-ai/

Grounding & classification

Source type: technical build writeup

56 fields verified against source quotes, 2 dropped as unverifiable.

anomaly detection, computer vision, conversational ai, forecasting, predictive analytics, recommendation system, builder submitted, metric backed, named customer, production runtime claimed, tools described, workflow described, logistics, travel, automation rate, employee productivity, throughput increase, technical build writeup, back office ops, customer support, data sync enrichment, monitor detect alert