
Lyft's ML Feature Serving Infrastructure: Unified Batch and Streaming Feature Access for Training and Online Inference

Lyft's ML models rely on features computed both by batch jobs on the data warehouse and from real-time event streams. Those features must be accessible in two modes: batch queries for model training and low-latency point lookups for online inference. This dual-access requirement called for a unified infrastructure solution.

How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear under “Tools used” below.
Stage 1 · SQL Feature Definition
ML practitioners define features using SQL, with one column designated as an entity ID and the remaining columns as features.
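As a rough illustration of this stage, the sketch below runs a SQL feature definition and splits each result row into an entity ID plus a set of named features. It is not Lyft's implementation: the in-memory `sqlite3` database stands in for the data warehouse, and the `rides` table, the `entity_id` alias, the feature names, and the `materialize_features` helper are all hypothetical names chosen for the example.

```python
import sqlite3

# Hypothetical feature definition: one column is designated the entity ID
# (aliased here as entity_id); every other column is treated as a feature.
FEATURE_SQL = """
SELECT
    rider_id      AS entity_id,
    COUNT(*)      AS ride_count,
    AVG(fare_usd) AS avg_fare_usd
FROM rides
GROUP BY rider_id
ORDER BY rider_id
"""


def materialize_features(conn, feature_sql, entity_column="entity_id"):
    """Run the feature SQL and return {entity_id: {feature_name: value}}."""
    cursor = conn.execute(feature_sql)
    columns = [description[0] for description in cursor.description]
    if entity_column not in columns:
        raise ValueError(f"feature SQL must select a column named {entity_column!r}")
    entity_idx = columns.index(entity_column)

    features = {}
    for row in cursor:
        # Every column other than the entity ID becomes a named feature.
        features[row[entity_idx]] = {
            name: value
            for idx, (name, value) in enumerate(zip(columns, row))
            if idx != entity_idx
        }
    return features


def demo():
    # Assumption: a tiny in-memory table stands in for warehouse data.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE rides (rider_id TEXT, fare_usd REAL)")
    conn.executemany(
        "INSERT INTO rides VALUES (?, ?)",
        [("r1", 10.0), ("r1", 20.0), ("r2", 8.0)],
    )
    try:
        return materialize_features(conn, FEATURE_SQL)
    finally:
        conn.close()


if __name__ == "__main__":
    for entity_id, values in demo().items():
        print(entity_id, values)
```

Keying the result by entity ID mirrors the point-lookup access pattern needed for online inference, while the same SQL result set could equally be written out in bulk for training.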
Tools used
Flyte, Flink, DynamoDB, Hive, Redis, Elasticsearch, Kafka
Outcome

The Feature Service has been widely adopted across Lyft teams since Q4 2017, hosting thousands of features across many ML models, serving millions of requests per minute with single-digit millisecond latency and 99.99%+ availability.

Results
Throughput: millions of requests per minute
Availability: 99.99%+
Running since: Q4 2017
Source

https://eng.lyft.com/ml-feature-serving-infrastructure-at-lyft-d30bf2d3c32a


Grounding & classification
Source type: technical build writeup
23 fields verified against source quotes.
Tags: fraud detection, predictive analytics, metric backed, named customer, production runtime claimed, tools described, workflow described, logistics, software, throughput increase, technical build writeup, back office ops, data sync enrichment