
Netflix scales Match Cutting ML pipeline across its entire catalog using Amber media ML infrastructure

Netflix's media ML practitioners struggled with inconsistent media access, expensive repeated computations across independent pipelines, and bespoke triggering components that were hard to maintain. Together, these problems kept the Match Cutting pipeline from scaling beyond single titles.

How it works

Common implementation structure

How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.

Stage 1 · New video file trigger

Amber automatically initiates scoring for new videos as soon as standardized video encodes are ready.
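The trigger stage above can be pictured as an event handler that starts scoring once per newly ready encode. The following is a minimal sketch, not Amber's actual interface: the `EncodeReadyTrigger` class, the `on_encode_ready` method, and the video identifiers are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class EncodeReadyTrigger:
    """Hypothetical dispatcher: runs a scoring callback once per ready encode."""

    score: Callable[[str], None]
    seen: set = field(default_factory=set)

    def on_encode_ready(self, video_id: str) -> bool:
        """Handle an 'encode ready' event; return True if scoring was started."""
        if video_id in self.seen:
            return False  # duplicate event: skip to avoid re-computation
        self.seen.add(video_id)
        self.score(video_id)
        return True


# Illustrative usage: the "scoring" callback just records the video id.
scored = []
trigger = EncodeReadyTrigger(score=scored.append)
trigger.on_encode_ready("title-001")
trigger.on_encode_ready("title-001")  # duplicate event is ignored
trigger.on_encode_ready("title-002")
```

Centralizing the trigger in one component like this is one way to avoid the inconsistencies that arise when every pipeline ships its own triggering logic.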

Tools used

Amber, Jasper, Marken, Metaflow, Ray, Conductor, Meson, Dagobah, Titus, Iceberg, Trino, Cassandra, Elastic Search, Spark, MezzFS, S3, Baggins, FSx

Outcome

The Amber infrastructure standardized media access, eliminated redundant computation through feature memoization, and enabled Match Cutting to scale across the entire Netflix catalog with automatic triggering for new videos; the GPU training cluster throughput increased 3–5 times.
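Feature memoization, as mentioned in the outcome, means computing a feature for a given video once and reusing the stored result across pipelines. The sketch below illustrates the general idea only. It assumes an in-memory dictionary as the store, and `FeatureStore`, `fake_embedding`, and the key layout are hypothetical, not taken from the source.

```python
from typing import Callable, Dict, List, Tuple


class FeatureStore:
    """Hypothetical memoizing store keyed by (video_id, feature_name)."""

    def __init__(self) -> None:
        self._cache: Dict[Tuple[str, str], List[float]] = {}
        self.compute_calls = 0  # counts how often a feature was really computed

    def get(
        self,
        video_id: str,
        feature_name: str,
        compute: Callable[[str], List[float]],
    ) -> List[float]:
        """Return the cached feature, computing it only on the first request."""
        key = (video_id, feature_name)
        if key not in self._cache:
            self.compute_calls += 1
            self._cache[key] = compute(video_id)
        return self._cache[key]


def fake_embedding(video_id: str) -> List[float]:
    """Stand-in for an expensive model call (illustrative only)."""
    return [float(len(video_id))]


store = FeatureStore()
a = store.get("title-001", "shot_embedding", fake_embedding)
b = store.get("title-001", "shot_embedding", fake_embedding)  # served from cache
print(store.compute_calls)  # 1
```

A production system would persist the cache in shared storage so that independent pipelines can reuse each other's results; the dictionary here only stands in for that.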

What failed first

The original Match Cutting pipeline lacked input file standardization, which caused quality issues for cross-title matching. Bespoke triggering components caused unnecessary re-computation and inconsistencies, and the quadratic pair computation made scaling to cross-catalog matching computationally intractable.
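To see why pair computation grows quadratically, note that comparing every item with every other item among n items requires n(n-1)/2 unordered pairs. The short sketch below works through that arithmetic. The shot counts are illustrative assumptions, not figures from the source, and `unordered_pairs` is a hypothetical helper name.

```python
from math import comb


def unordered_pairs(n_shots: int) -> int:
    """Number of distinct unordered pairs among n_shots items: n(n-1)/2."""
    return comb(n_shots, 2)


# Illustrative sizes only: a 100x increase in shots yields roughly 10,000x more pairs.
for n in (1_000, 10_000, 100_000):
    print(f"{n:>7} shots -> {unordered_pairs(n):>13,} pairs")
```

Doubling the number of shots roughly quadruples the number of pairs, which is why an approach that is workable within one title becomes impractical across a whole catalog.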

Results

Volume: 3–5 times

Cost replaced: saving on compute costs

Source

https://netflixtechblog.com/scaling-media-machine-learning-at-netflix-f19b400243


Grounding & classification

Source type: technical build writeup

38 fields verified against source quotes.

computer vision, data extraction, recommendation system, product catalog, failure mode described, metric backed, named customer, production runtime claimed, source backed, tools described, workflow described, media, cost reduction, employee productivity, throughput increase, technical build writeup, back office ops, marketing ops, data sync enrichment, extract classify route