FM-Intent: Predicting User Session Intent with Hierarchical Multi-Task Learning at Netflix
Netflix's foundation model focused on next-item prediction but lacked the ability to capture or leverage underlying user session intent, and existing intent prediction approaches did not establish a hierarchical relationship between intent and item prediction tasks.
How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · Input feature construction
The input feature for each interaction combines categorical embeddings and numerical features to create a comprehensive representation of user behavior.
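A minimal sketch of this stage, assuming each categorical field has its own embedding table and that the per-interaction feature is a plain concatenation of those embeddings with the numerical values. The field names (`action_type`, `genre`), vocabulary sizes, and embedding dimension are hypothetical and not taken from the source; a real model would learn the tables rather than draw them at random.

```python
import random

def make_embedding_table(num_ids, dim, seed):
    """Build a toy embedding table: one random vector per categorical id."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(num_ids)]

# Hypothetical categorical fields and vocabulary sizes (assumptions, not from the source).
EMBED_DIM = 4
TABLES = {
    "action_type": make_embedding_table(5, EMBED_DIM, seed=0),
    "genre": make_embedding_table(10, EMBED_DIM, seed=1),
}

def build_interaction_feature(categorical, numerical):
    """Concatenate one embedding per categorical field with the numerical features."""
    feature = []
    for field in sorted(TABLES):  # fixed field order keeps the vector layout stable
        feature.extend(TABLES[field][categorical[field]])
    feature.extend(float(x) for x in numerical)
    return feature

interaction = {"action_type": 2, "genre": 7}
numeric = [0.35, 1.0]  # illustrative values, e.g. a normalized duration and a flag
vec = build_interaction_feature(interaction, numeric)
print(len(vec))  # 2 fields * 4 dims + 2 numerical values = 10
```

A sequence of such vectors, one per interaction in the session, is what a sequence model such as a Transformer encoder would then consume.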
Tools used
Transformer encoder, K-means++
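The source lists K-means++ without saying what it is applied to. For readers unfamiliar with it, the seeding step of the algorithm can be sketched as follows: the first center is drawn uniformly, and each later center is drawn with probability proportional to its squared distance from the nearest center already chosen. The function names and the toy data are illustrative only.

```python
import random

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_pp_init(points, k, seed=0):
    """Pick k initial cluster centers using K-means++ seeding."""
    rng = random.Random(seed)
    centers = [rng.choice(points)]  # first center: uniform at random
    while len(centers) < k:
        # Squared distance from each point to its nearest chosen center.
        d2 = [min(squared_distance(p, c) for c in centers) for p in points]
        if sum(d2) == 0:
            # Every point coincides with a chosen center; fall back to uniform.
            centers.append(rng.choice(points))
        else:
            # Sample the next center with probability proportional to d2.
            centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers

# Toy 2-D data (hypothetical).
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centers = kmeans_pp_init(points, k=2)
```

These seeds would normally be handed to the standard k-means update loop, which is omitted here.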
Outcome
FM-Intent achieves a statistically significant 7.4% improvement in next-item prediction accuracy over the best baseline and has been successfully integrated into Netflix's recommendation ecosystem.
What failed first
Prior approaches to intent prediction used simple multi-task learning without a hierarchical structure, and most baseline models either could not predict user intent or could not incorporate intent predictions into next-item recommendations.
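The difference between flat and hierarchical multi-task learning can be sketched in a few lines. In the sketch below, the intent head runs first and its predicted distribution is appended to the session representation before the item head runs, so item prediction is conditioned on intent. This is one plausible reading of "hierarchical", not the architecture described by Netflix: the layer sizes, the concatenation step, the single intent head, and the `intent_weight` loss coefficient are all assumptions.

```python
import math
import random

def linear(weights, x):
    """Matrix-vector product: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def init_weights(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
HIDDEN, NUM_INTENTS, NUM_ITEMS = 6, 3, 5  # illustrative sizes (assumptions)
W_intent = init_weights(NUM_INTENTS, HIDDEN, rng)
# The item head sees the session representation plus the intent distribution.
W_item = init_weights(NUM_ITEMS, HIDDEN + NUM_INTENTS, rng)

def forward(session_repr):
    """Hierarchical pass: predict intent first, then items conditioned on it."""
    intent_probs = softmax(linear(W_intent, session_repr))
    item_input = session_repr + intent_probs  # list concatenation
    item_probs = softmax(linear(W_item, item_input))
    return intent_probs, item_probs

def joint_loss(intent_probs, item_probs, true_intent, true_item, intent_weight=0.5):
    """Cross-entropy on the item task plus a weighted cross-entropy on intent."""
    item_ce = -math.log(item_probs[true_item])
    intent_ce = -math.log(intent_probs[true_intent])
    return item_ce + intent_weight * intent_ce

session_repr = [rng.uniform(-1, 1) for _ in range(HIDDEN)]
intent_probs, item_probs = forward(session_repr)
loss = joint_loss(intent_probs, item_probs, true_intent=1, true_item=3)
```

A flat multi-task baseline would instead give `W_item` only `HIDDEN` input columns and feed it `session_repr` alone, so the two heads would share an encoder but the item head would never see the intent prediction.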
Results
Volume: 7.4% (improvement in next-item prediction accuracy over the best baseline)
Grounding & classification
Source type: technical build writeup
14 fields verified against source quotes.
personalization, predictive analytics, recommendation system, metric backed, named customer, production runtime claimed, source backed, tools described, workflow described, media, accuracy improvement, technical build writeup