Shopify builds Vision Language Model product classification system processing 30M daily predictions
Shopify's early product classification systems struggled with increasing product complexity and diversity on the platform. By early 2023, the team identified unmet requirements including more granular product understanding, consistent taxonomy, attribute extraction, richer metadata, and content safety features that category-only classification could not satisfy.
How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · Dynamic product request batching
Incoming product requests are dynamically grouped based on real-time arrival patterns rather than fixed batch sizes.
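The batching idea above can be sketched as follows. This is a minimal illustration and not Shopify's implementation: the `Request` type, the `dynamic_batches` function, and the `max_batch_size` and `max_wait_s` parameters are hypothetical names chosen for the example. A batch is closed either when it is full or when the next request arrives too long after the first request in the batch, so batch sizes follow the arrival pattern instead of being fixed.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class Request:
    product_id: str   # hypothetical identifier for the product to classify
    arrival_s: float  # arrival time in seconds


def dynamic_batches(
    requests: Iterable[Request],
    max_batch_size: int = 8,
    max_wait_s: float = 0.05,
) -> Iterator[List[Request]]:
    """Group requests (assumed sorted by arrival time) into variable-size batches."""
    batch: List[Request] = []
    for req in requests:
        # Close the open batch if this request arrived after the batch's wait window.
        if batch and req.arrival_s - batch[0].arrival_s > max_wait_s:
            yield batch
            batch = []
        batch.append(req)
        # Close the batch as soon as it reaches the size cap.
        if len(batch) >= max_batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # flush whatever is left at the end of the stream


# A burst of four requests followed by a late straggler.
stream = [
    Request("a", 0.00),
    Request("b", 0.01),
    Request("c", 0.02),
    Request("e", 0.03),
    Request("d", 0.20),
]
batch_sizes = [len(b) for b in dynamic_batches(stream, max_batch_size=3)]
```

With a size cap of 3, the burst fills one batch, the fourth request starts a new batch that is closed when the straggler arrives outside its wait window, and the straggler is flushed alone. A production serving stack would typically apply the same two limits against a live queue with a timer instead of recorded timestamps.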
Tools used
Vision Language Models, LLaVA 1.5 7B, LLaMA 3.2 11B, Qwen2VL 7B, Nvidia Dynamo (partner), Kubernetes, Dataflow
Outcome
The VLM-based system achieves an 85% merchant acceptance rate for predicted categories, processes over 30 million predictions daily, and has doubled hierarchical precision and recall compared to the earlier neural network approach.
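For readers unfamiliar with the metric, hierarchical precision and recall give partial credit when a prediction lands in the right branch of the taxonomy but at the wrong leaf. The sketch below uses one common definition, which compares the sets of ancestor nodes of the predicted and true categories and micro-averages over examples. The source does not say how Shopify computes the metric, so this definition, the function names, and the example category paths are all assumptions made for illustration.

```python
from typing import List, Sequence, Set, Tuple

Path = Sequence[str]  # a category as a root-to-node path in the taxonomy


def prefixes(path: Path) -> Set[Tuple[str, ...]]:
    """Return the node itself plus all of its ancestors, each as a path tuple."""
    return {tuple(path[:i]) for i in range(1, len(path) + 1)}


def hierarchical_pr(predicted: List[Path], actual: List[Path]) -> Tuple[float, float]:
    """Micro-averaged hierarchical precision and recall over paired examples."""
    overlap = pred_total = true_total = 0
    for pred, true in zip(predicted, actual):
        p, t = prefixes(pred), prefixes(true)
        overlap += len(p & t)    # taxonomy nodes shared by prediction and truth
        pred_total += len(p)     # nodes implied by the prediction
        true_total += len(t)     # nodes implied by the true label
    precision = overlap / pred_total if pred_total else 0.0
    recall = overlap / true_total if true_total else 0.0
    return precision, recall


# Hypothetical category paths, not taken from Shopify's taxonomy.
predicted = [("Apparel", "Shoes", "Boots"), ("Apparel", "Shoes")]
actual = [("Apparel", "Shoes", "Sneakers"), ("Apparel", "Shoes", "Sneakers")]
h_precision, h_recall = hierarchical_pr(predicted, actual)
```

In this example the first prediction gets two of three levels right, and the second stops one level short of the true leaf, so precision (4 shared nodes out of 5 predicted) is higher than recall (4 shared nodes out of 6 true).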
What failed first
The initial 2018 classifier, a logistic regression over TF-IDF features, was effective only for simple cases, and the 2020 multi-modal system still fell short because classifying categories alone was insufficient for comprehensive product understanding.