
Netflix scales Match Cutting ML pipeline across its entire catalog using Amber media ML infrastructure

Netflix's media ML practitioners struggled with inconsistent media access, expensive repeated computations across independent pipelines, and bespoke triggering components that were hard to maintain — preventing the Match Cutting pipeline from scaling beyond single titles.

How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases, not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · New video file trigger
Amber automatically initiates scoring for new videos as soon as standardized video encodes are ready.
Tools used
Amber, Jasper, Marken, Metaflow, Ray, Conductor, Meson, Dagobah, Titus, Iceberg, Trino, Cassandra, Elastic Search, Spark, MezzFS, S3, Baggins, FSx
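As a rough illustration of the Stage 1 trigger, the sketch below scores a video when an "encode ready" event arrives and ignores duplicate events for the same video. It is not Amber's API: `EncodeReadyTrigger`, `on_encode_ready`, and `fake_score` are hypothetical names, and the in-memory dictionary stands in for whatever store a real system would use.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class EncodeReadyTrigger:
    """Hypothetical event-driven trigger; an assumption, not Amber's interface."""

    score_fn: Callable[[str], float]  # assumed scoring callback
    scores: Dict[str, float] = field(default_factory=dict)

    def on_encode_ready(self, video_id: str) -> bool:
        """Score a video once its standardized encode is ready.

        Returns True if scoring ran, False if the event was a duplicate.
        """
        if video_id in self.scores:  # idempotent: skip repeated events
            return False
        self.scores[video_id] = self.score_fn(video_id)
        return True


calls: List[str] = []  # records which videos were actually scored


def fake_score(video_id: str) -> float:
    """Placeholder scorer (assumption): returns the length of the id."""
    calls.append(video_id)
    return float(len(video_id))


trigger = EncodeReadyTrigger(score_fn=fake_score)
# The third event repeats "title_a" and should not trigger a second run.
results = [trigger.on_encode_ready(v) for v in ["title_a", "title_b", "title_a"]]
```

Making the handler idempotent is one way to avoid the unnecessary re-computation that bespoke triggers are reported to have caused.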
Outcome

The Amber infrastructure standardized media access, eliminated redundant computation through feature memoization, and enabled Match Cutting to scale across the entire Netflix catalog with automatic triggering for new videos; the GPU training cluster throughput increased 3–5 times.
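Feature memoization can be pictured as a cache keyed by what was computed and on which video, so that independent pipelines reuse a stored result instead of recomputing it. The sketch below is a minimal in-memory version under stated assumptions: `MemoizedFeatureStore`, the `(name, version, video_id)` key, and the `embed` function are all hypothetical and do not describe how Amber stores features.

```python
from typing import Callable, Dict, List, Tuple

# Assumed cache key: (feature name, feature version, video id)
FeatureKey = Tuple[str, str, str]


class MemoizedFeatureStore:
    """Hypothetical in-memory feature cache, for illustration only."""

    def __init__(self) -> None:
        self._store: Dict[FeatureKey, List[float]] = {}
        self.compute_count = 0  # how many times a feature was really computed

    def get_or_compute(
        self,
        name: str,
        version: str,
        video_id: str,
        compute: Callable[[str], List[float]],
    ) -> List[float]:
        key = (name, version, video_id)
        if key not in self._store:  # cache miss: compute once and store
            self.compute_count += 1
            self._store[key] = compute(video_id)
        return self._store[key]


def embed(video_id: str) -> List[float]:
    """Placeholder feature function (assumption), standing in for a model."""
    return [float(len(video_id)), 1.0]


store = MemoizedFeatureStore()
first = store.get_or_compute("shot_embedding", "v1", "title_a", embed)
second = store.get_or_compute("shot_embedding", "v1", "title_a", embed)  # cache hit
bumped = store.get_or_compute("shot_embedding", "v2", "title_a", embed)  # new version
```

Including a version in the key means a changed feature definition is recomputed rather than served stale, while unchanged features are computed only once.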

What failed first

The original Match Cutting pipeline failed in three ways: it lacked input file standardization, which caused quality issues for cross-title matching; its bespoke triggering components caused unnecessary re-computation and inconsistencies; and the quadratic pair computation made scaling to cross-catalog matching computationally intractable.
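The quadratic growth can be made concrete with a little arithmetic: comparing every shot with every other shot requires n(n-1)/2 pairs for n shots. The shot counts below are illustrative assumptions, not figures from the source, and `candidate_pairs` is a hypothetical helper.

```python
from math import comb


def candidate_pairs(num_shots: int) -> int:
    """Number of unordered shot pairs among num_shots shots: n * (n - 1) / 2."""
    return comb(num_shots, 2)


# Illustrative numbers only (assumption): 2,000 shots in one title,
# and 20,000 shots when ten such titles are matched against each other.
single_title = candidate_pairs(2_000)
ten_titles = candidate_pairs(20_000)

# Ten times as many shots yields roughly one hundred times as many pairs.
growth = ten_titles / single_title
```

Under these assumed counts, a tenfold increase in shots produces about a hundredfold increase in pairs, which is why exhaustive cross-catalog comparison quickly becomes impractical.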

Results
Volume: 3–5 times
Cost replaced: saving on compute costs
Source

https://netflixtechblog.com/scaling-media-machine-learning-at-netflix-f19b400243


Grounding & classification
Source type: technical build writeup
38 fields verified against source quotes.
computer vision, data extraction, recommendation system, product catalog, failure mode described, metric backed, named customer, production runtime claimed, source backed, tools described, workflow described, media, cost reduction, employee productivity, throughput increase, technical build writeup, back office ops, marketing ops, data sync enrichment, extract classify route