
Flyte: Lyft's ML orchestration platform powers 1M+ pipelines and joins LF AI & Data

Lyft needed to orchestrate complex ML workflows for its ETA product. That meant managing large historical training datasets and complex output artifacts, running backtests, retraining frequently, and deploying multiple models simultaneously. The largest bottleneck was procuring and managing infrastructure for models that might not work out.

How it works

Common implementation structure

How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.

Stage 1 · Historical data training trigger

Large amounts of historical data must be used to train a set of ensemble models.

Tools used

Flyte, AWS Step Functions, flytekit
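The source does not show Lyft's training code, so the following is only a minimal sketch of what this stage might look like: a window of historical records is used to fit a small ensemble, which is then backtested on held-out records. The record layout, the two toy member models, and every function name here are hypothetical and chosen purely for illustration. In a Flyte deployment, functions like these would typically be wrapped with flytekit's `@task` and composed under `@workflow`; the sketch uses plain Python so it runs without any platform installed.

```python
from statistics import mean
from typing import Callable, List, Sequence, Tuple

# Hypothetical record layout: (trip distance in km, observed travel time in minutes).
Record = Tuple[float, float]
Model = Callable[[float], float]


def fit_mean_model(history: Sequence[Record]) -> Model:
    """Baseline member: always predict the average observed travel time."""
    avg = mean(t for _, t in history)
    return lambda distance: avg


def fit_rate_model(history: Sequence[Record]) -> Model:
    """Second member: predict distance times the overall minutes-per-km rate."""
    rate = sum(t for _, t in history) / sum(d for d, _ in history)
    return lambda distance: rate * distance


def train_ensemble(history: Sequence[Record]) -> Model:
    """Fit every member on the same history and average their predictions."""
    members: List[Model] = [fit_mean_model(history), fit_rate_model(history)]
    return lambda distance: mean(m(distance) for m in members)


def backtest(model: Model, holdout: Sequence[Record]) -> float:
    """Mean absolute error of the model on held-out historical records."""
    return mean(abs(model(d) - t) for d, t in holdout)


# Toy historical data standing in for the large datasets the stage describes.
history = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]
ensemble = train_ensemble(history)
error = backtest(ensemble, [(4.0, 12.0)])
```

Keeping training and backtesting as separate functions mirrors how an orchestrator would schedule them as separate steps, so a retraining run can be repeated or swapped out without touching the evaluation step.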

Outcome

By mid-2020, Flyte was powering more than 1 million pipelines at Lyft across the ETA, Pricing, Mapping, Driver Engagement, Growth, and Map generation teams, and it was contributed to Linux Foundation AI & Data as its 25th hosted project.

What failed first

The first version (v1) of Flyte used AWS Step Functions as its scheduler. It proved too rigid to extend natively with new features, so the team built a container-native scheduling engine.

Results

Volume: more than 1 million pipelines

Cost replaced: total cost of running Flyte at Lyft was low

Running since: late 2017

Source

https://eng.lyft.com/flyte-joins-lf-ai-data-48c9b4b60eec


Grounding & classification

Source type: technical build writeup

20 fields verified against source quotes.

forecasting, predictive analytics, failure mode described, metric backed, named customer, production runtime claimed, tools described, workflow described, logistics, cost reduction, employee productivity, throughput increase, technical build writeup, back office ops