Shopify builds real-time ML embedding pipelines processing 2,500 embeddings per second for Semantic Search
Shopify merchants needed search results that reflected consumer intent beyond keyword matching, and product and image updates needed to be reflected in search results nearly instantly after creation or modification.
How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · Event triggers pipeline
The pipeline listens to new events from an input event topic that signals an image has been created or modified on a merchant's website.
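As a rough illustration of this trigger stage, the sketch below filters a stream of events down to the images that need a new embedding. The event schema (`ImageEvent`, its `action` values) and the function name are assumptions made for the example; the source does not describe the real message format or the consumer code.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


# Hypothetical event schema; the real topic's message format is not documented here.
@dataclass(frozen=True)
class ImageEvent:
    image_id: str
    action: str  # e.g. "created", "modified", "deleted"


# Only these actions should trigger a new embedding.
EMBED_ACTIONS = {"created", "modified"}


def images_to_embed(events: Iterable[ImageEvent]) -> Iterator[str]:
    """Yield the image IDs whose events should trigger an embedding run."""
    for event in events:
        if event.action in EMBED_ACTIONS:
            yield event.image_id


# A small in-memory list standing in for the input event topic.
topic = [
    ImageEvent("img-1", "created"),
    ImageEvent("img-2", "deleted"),
    ImageEvent("img-1", "modified"),
]
triggered = list(images_to_embed(topic))
print(triggered)
```

In a streaming runner such as Apache Beam, the same predicate would typically sit in a filter step directly after the source that reads the topic.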
Tools used
Dataflow, BigQuery, Apache Beam, T4 GPU
Outcome
Shopify now processes roughly 2,500 embeddings per second (216 million per day) in near real time across image and text pipelines. It also reduced the memory footprint by roughly 2.6x and eliminated the extra 14% in cost by reverting to n1-standard-16 machines.
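The daily figure follows directly from the per-second rate. A quick check, assuming the rate is sustained over a full 24-hour day:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

embeddings_per_second = 2_500
embeddings_per_day = embeddings_per_second * SECONDS_PER_DAY

print(f"{embeddings_per_day:,}")  # 216,000,000
```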
What failed first
The initial image embedding pipeline ran into out-of-memory (OOM) errors on n1-standard-16 machines; switching to n1-highmem-16 machines resolved them but increased costs by 14%. Batching also proved ineffective: because the input topics were bursty, elements arrived in bundles of 1, so batches of 1 were sent to the GPU.
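The batching problem can be reproduced with a small simulation. The sketch below is plain Python, not Beam code, and is not taken from Shopify's pipeline; the function names and the batch size of 4 are illustrative assumptions. It contrasts batching that is confined to each bundle with a buffer that carries elements across bundle boundaries.

```python
from typing import List, Sequence


def batch_within_bundles(bundles: Sequence[Sequence[str]], max_batch_size: int) -> List[List[str]]:
    """Batch elements without ever crossing a bundle boundary."""
    batches = []
    for bundle in bundles:
        for start in range(0, len(bundle), max_batch_size):
            batches.append(list(bundle[start:start + max_batch_size]))
    return batches


def batch_across_bundles(bundles: Sequence[Sequence[str]], max_batch_size: int) -> List[List[str]]:
    """Buffer elements across bundles and emit a batch once the buffer is full."""
    batches = []
    buffer: List[str] = []
    for bundle in bundles:
        for element in bundle:
            buffer.append(element)
            if len(buffer) == max_batch_size:
                batches.append(buffer)
                buffer = []
    if buffer:  # flush whatever is left at the end of the input
        batches.append(buffer)
    return batches


# Bursty input modeled as eight bundles that each hold a single element.
bursty_bundles = [[f"img-{i}"] for i in range(8)]

per_bundle = batch_within_bundles(bursty_bundles, max_batch_size=4)
buffered = batch_across_bundles(bursty_bundles, max_batch_size=4)

print([len(b) for b in per_bundle])  # [1, 1, 1, 1, 1, 1, 1, 1]
print([len(b) for b in buffered])    # [4, 4]
```

With single-element bundles, per-bundle batching can never hand the GPU more than one element at a time, whatever the configured batch size. A real streaming pipeline would also need a time-based flush so that a partially filled buffer does not wait indefinitely; in Apache Beam, the stateful `GroupIntoBatches` transform with a maximum buffering duration is one way to get both limits.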