
FM-Intent: Predicting User Session Intent with Hierarchical Multi-Task Learning at Netflix

Netflix's foundation model focused on next-item prediction but lacked the ability to capture or leverage underlying user session intent, and existing intent prediction approaches did not establish a hierarchical relationship between intent and item prediction tasks.

How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · Input feature construction
The input feature for each interaction combines categorical embeddings and numerical features to create a comprehensive representation of user behavior.
Tools used
Transformer encoder, K-means++
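As a rough illustration of Stage 1, the sketch below concatenates embedding lookups for categorical fields with a scaled numerical field to form one vector per interaction. The field names (`action`, `genre`, `duration_minutes`), the vocabularies, the embedding size, and the scaling cap are all assumptions made for the example; the source does not list the actual features, and in a real model the embedding tables would be learned rather than randomly initialized.

```python
import random

def make_embedding_table(num_ids, dim, seed):
    """Build a small random embedding table (a stand-in for learned embeddings)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(num_ids)]

# Hypothetical categorical vocabularies, chosen only for illustration.
ACTION_TYPES = {"play": 0, "browse": 1, "search": 2}
GENRES = {"drama": 0, "comedy": 1, "documentary": 2}

EMBED_DIM = 4  # assumed embedding size
ACTION_TABLE = make_embedding_table(len(ACTION_TYPES), EMBED_DIM, seed=0)
GENRE_TABLE = make_embedding_table(len(GENRES), EMBED_DIM, seed=1)

def build_interaction_feature(action, genre, duration_minutes, max_duration=180.0):
    """Concatenate categorical embeddings with a scaled numerical feature."""
    action_vec = ACTION_TABLE[ACTION_TYPES[action]]
    genre_vec = GENRE_TABLE[GENRES[genre]]
    # Clip and scale the numerical feature into [0, 1] (assumed preprocessing).
    numeric = [min(duration_minutes, max_duration) / max_duration]
    return action_vec + genre_vec + numeric

feature = build_interaction_feature("play", "drama", 45.0)
print(len(feature))  # two 4-dimensional embeddings plus one numerical value
```

A sequence of such vectors, one per interaction, would then be passed to the sequence encoder listed under "Tools used".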
Outcome

FM-Intent achieves a statistically significant 7.4% improvement in next-item prediction accuracy over the best baseline and has been successfully integrated into Netflix's recommendation ecosystem.

What failed first

Prior approaches to intent prediction used simple multi-task learning without a hierarchical structure, and most baseline models either could not predict user intent or could not incorporate intent predictions into next-item recommendations.
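To make the contrast concrete, here is a minimal sketch of a hierarchical arrangement, in which the intent prediction is computed first and then fed into the next-item head, rather than the two heads running independently off a shared representation. Everything in it is an assumption for illustration: `session_repr` stands in for the output of a sequence encoder, the layer sizes are arbitrary, the weights are random rather than trained, and `intent_weight` is a hypothetical loss-balancing parameter. It is not Netflix's implementation.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def linear(weights, bias, x):
    """Apply a dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]

def init_layer(out_dim, in_dim, rng):
    """Random weights and zero bias (stand-ins for trained parameters)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    return weights, [0.0] * out_dim

rng = random.Random(0)
HIDDEN, NUM_INTENTS, NUM_ITEMS = 8, 3, 5  # assumed sizes

intent_w, intent_b = init_layer(NUM_INTENTS, HIDDEN, rng)
# The item head sees the session representation AND the intent probabilities.
item_w, item_b = init_layer(NUM_ITEMS, HIDDEN + NUM_INTENTS, rng)

def forward(session_repr):
    """Predict intent first, then condition the next-item prediction on it."""
    intent_probs = softmax(linear(intent_w, intent_b, session_repr))
    item_input = session_repr + intent_probs  # list concatenation
    item_probs = softmax(linear(item_w, item_b, item_input))
    return intent_probs, item_probs

def joint_loss(intent_probs, item_probs, true_intent, true_item, intent_weight=0.5):
    """Cross-entropy on both tasks, with intent as a weighted auxiliary term."""
    item_loss = -math.log(item_probs[true_item])
    intent_loss = -math.log(intent_probs[true_intent])
    return item_loss + intent_weight * intent_loss

session_repr = [rng.uniform(-1, 1) for _ in range(HIDDEN)]
intent_probs, item_probs = forward(session_repr)
loss = joint_loss(intent_probs, item_probs, true_intent=1, true_item=2)
print(round(sum(intent_probs), 6), len(item_probs))
```

In a flat multi-task setup, the item head would take only `session_repr` as input; the hierarchical version differs in that `item_input` also carries the intent distribution.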

Results
Volume: 7.4%
Source

https://netflixtechblog.com/fm-intent-predicting-user-session-intent-with-hierarchical-multi-task-learning-94c75e18f4b8


Grounding & classification
Source type: technical build writeup
14 fields verified against source quotes.
personalization, predictive analytics, recommendation system, metric backed, named customer, production runtime claimed, source backed, tools described, workflow described, media, accuracy improvement, technical build writeup