Netflix uses neural embeddings and LSH to compress build logs from millions to thousands of lines

Netflix engineers faced build logs up to 2.5GB and 3 million lines, making manual bug-finding practically impossible; existing diff tools either produced hundreds of thousands of candidate lines or took an hour to run while still leaving 40,000 lines to review.

How it works

Common implementation structure

How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.

Stage 1 · Failed build triggers log diff

A failed software build produces a large log file that requires diff analysis to find the bug or regression.

Tools used

TensorFlow 2.2, scikit-learn NearestNeighbors, neural embeddings, locality-sensitive hashing

Outcome

Netflix's neural embedding and LSH solution produces 20,000 candidate lines in 20 minutes, enabling engineers to review a small fraction of log output, with examples showing up to 200x log compression.
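As a rough illustration of the idea, the sketch below buckets log lines with random-hyperplane LSH and reports only the failing-build lines whose bucket never appears in the passing build. It rests on several assumptions: the source does not say which LSH family Netflix used, and the `embed` function here is a hashed bag-of-words stand-in, not a neural embedding model. All names (`embed`, `make_planes`, `signature`, `lsh_diff`) and the sizes `DIM` and `N_PLANES` are hypothetical.

```python
import random
import zlib

DIM = 64       # embedding size (assumed, not from the source)
N_PLANES = 8   # number of signature bits (assumed, not from the source)

def embed(line):
    """Stand-in embedding: hashed bag of words. NOT a neural model."""
    vec = [0.0] * DIM
    for token in line.lower().split():
        # Skip tokens containing digits (timestamps, counts, IDs).
        if any(ch.isdigit() for ch in token):
            continue
        vec[zlib.crc32(token.encode("utf-8")) % DIM] += 1.0
    return vec

def make_planes(seed=0):
    """Draw N_PLANES random hyperplanes with a fixed seed for repeatability."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(N_PLANES)]

def signature(vec, planes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(sum(p * v for p, v in zip(plane, vec)) >= 0 for plane in planes)

def lsh_diff(good_lines, bad_lines, planes):
    """Keep failing-build lines whose LSH bucket is absent from the passing build."""
    good_buckets = {signature(embed(line), planes) for line in good_lines}
    return [line for line in bad_lines
            if signature(embed(line), planes) not in good_buckets]

good_log = ["12:00:01 compile module core", "12:00:02 run 120 tests"]
bad_log = [
    "12:07:41 compile module core",
    "12:07:42 run 120 tests",
    "12:07:43 ERROR test_login failed",
]
planes = make_planes()
candidates = lsh_diff(good_log, bad_log, planes)
print(candidates)
```

Lines that differ only in numeric tokens get identical vectors and therefore identical buckets, so they are never reported. Dissimilar lines are likely, but not guaranteed, to land in a different bucket; more hyperplanes make buckets finer at the cost of more false candidates.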

What failed first

Standard md5 diff produced hundreds of thousands of candidate lines because a hash comparison only matches lines that are exactly identical, so any varying detail marks a line as changed. Fuzzy diffing with k-nearest neighbors took an hour and still yielded 40,000 lines. Neither approach handled semantic similarity between log lines.
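To see why an exact-hash diff produces so many candidates, consider a minimal sketch of a line-level md5 diff. The function name `md5_line_diff` and the sample log lines are made up for illustration; this is not the tool described in the source.

```python
import hashlib

def md5_line_diff(good_lines, bad_lines):
    """Return failing-build lines whose md5 hash never occurs in the passing build."""
    seen = {hashlib.md5(line.encode("utf-8")).hexdigest() for line in good_lines}
    return [line for line in bad_lines
            if hashlib.md5(line.encode("utf-8")).hexdigest() not in seen]

# Hypothetical logs: the first two lines differ only by timestamp.
good_log = [
    "12:00:01 compile module core",
    "12:00:02 run 120 tests",
    "12:00:03 build finished",
]
bad_log = [
    "12:07:41 compile module core",
    "12:07:42 run 120 tests",
    "12:07:43 ERROR test_login failed",
]

candidates = md5_line_diff(good_log, bad_log)
print(len(candidates))  # 3: every line is flagged, not just the error
```

Because the timestamps change on every run, all three lines hash differently and are reported, even though only one carries new information. At the scale of millions of lines, the same effect buries the real failure among lines that differ only trivially.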

Results

Run time: 20 min (down from an hour with k-nearest-neighbor fuzzy diffing)

Volume: 20,000 candidate lines

Source

https://netflixtechblog.com/machine-learning-for-a-better-developer-experience-1e600c69f36c

Grounding & classification

Source type: technical build writeup

24 fields verified against source quotes.

Tags: anomaly detection, builder submitted, failure mode described, metric backed, named customer, production runtime claimed, tools described, workflow described, media, software, cycle time reduction, employee productivity, technical build writeup, incident management, quality assurance, monitor detect alert