Shopify builds real-time ML embedding pipelines processing 2,500 embeddings per second for Semantic Search

Shopify merchants needed search results that reflected consumer intent rather than keyword matches alone, and new or modified products and images had to appear in search results almost immediately.

How it works

Common implementation structure

How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.

Stage 1 · Event triggers pipeline

The pipeline listens to new events from an input event topic that signals an image has been created or modified on a merchant's website.
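As a rough illustration of this trigger stage, the sketch below uses an in-memory queue as a stand-in for the input event topic and keeps only the events that should start an embedding. The `ImageEvent` shape, the event kinds, and the function name are hypothetical; the source does not describe the real event schema or consumer code.

```python
import queue
from dataclasses import dataclass
from typing import List


# Hypothetical event shape; the real schema is not given in the source.
@dataclass
class ImageEvent:
    image_id: str
    kind: str  # e.g. "created", "modified", "deleted"


# Assumption: only creations and modifications trigger a new embedding.
TRIGGER_KINDS = {"created", "modified"}


def drain_triggering_events(topic: "queue.Queue[ImageEvent]") -> List[str]:
    """Pull every event currently on the topic; return image IDs to embed."""
    to_embed: List[str] = []
    while True:
        try:
            event = topic.get_nowait()
        except queue.Empty:
            break
        if event.kind in TRIGGER_KINDS:
            to_embed.append(event.image_id)
    return to_embed


# In-memory stand-in for the input event topic.
topic: "queue.Queue[ImageEvent]" = queue.Queue()
for ev in [
    ImageEvent("img-1", "created"),
    ImageEvent("img-2", "deleted"),
    ImageEvent("img-3", "modified"),
]:
    topic.put(ev)

pending = drain_triggering_events(topic)
print(pending)
```

In a production pipeline the queue would be replaced by a streaming source, and the returned IDs would feed the embedding stage rather than a list.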

Tools used

Dataflow, BigQuery, Apache Beam, T4 GPU

Outcome

Shopify now processes roughly 2,500 embeddings per second (216 million per day) in near real time across image and text pipelines, achieved a ~2.6x memory footprint decrease, and eliminated the extra 14% in cost by reverting to n1-standard-16 machines.
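The per-second and per-day figures are consistent with each other, which a quick calculation confirms. The function name below is illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def embeddings_per_day(per_second: int) -> int:
    """Convert a sustained per-second rate into a daily total."""
    return per_second * SECONDS_PER_DAY


daily_total = embeddings_per_day(2_500)
print(f"{daily_total:,} embeddings per day")
```

This assumes the 2,500 per second rate is sustained around the clock; with bursty traffic the real daily total would differ.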

What failed first

The initial image embedding pipeline ran into Out of Memory (OOM) errors on n1-standard-16 machines; switching to n1-highmem-16 machines resolved the OOM errors but increased costs by 14%. Batching also proved ineffective: because the input topics were bursty, elements were organized into bundles of 1, so the GPU received batches of 1.
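One general way to avoid single-element batches is to buffer elements across arrivals and flush on either a size limit or a wait limit. The sketch below simulates that idea over timestamped arrivals. It is not taken from the source and is not presented as the fix Shopify adopted; the function name, parameters, and sample data are all hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple


def batch_by_size_or_wait(
    events: Iterable[Tuple[float, str]],
    max_batch_size: int,
    max_wait_seconds: float,
) -> Iterator[List[str]]:
    """Group (arrival_time, item) pairs into batches.

    A batch is flushed when it reaches max_batch_size, or when the next
    item arrives more than max_wait_seconds after the batch's first item.
    """
    batch: List[str] = []
    first_arrival = 0.0
    for arrival, item in events:
        # The open batch has waited too long: flush it before adding more.
        if batch and arrival - first_arrival > max_wait_seconds:
            yield batch
            batch = []
        if not batch:
            first_arrival = arrival
        batch.append(item)
        # The batch is full: flush immediately.
        if len(batch) >= max_batch_size:
            yield batch
            batch = []
    # Flush whatever remains at the end of the input.
    if batch:
        yield batch


# Simulated arrivals: three items close together, then two more later.
arrivals = [(0.00, "a"), (0.01, "b"), (0.02, "c"), (0.50, "d"), (0.51, "e")]
batches = list(batch_by_size_or_wait(arrivals, max_batch_size=4, max_wait_seconds=0.1))
print(batches)
```

Because this is a simulation, the wait limit is only checked when the next item arrives; a real streaming system would flush on a timer so that a quiet topic does not hold a partial batch indefinitely.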

Results

Volume per day: 216 million embeddings

Volume per second: 2,500 embeddings

Cost increase eliminated: 14%

Source

https://shopify.engineering/how-shopify-improved-consumer-search-intent-with-real-time-ml

Grounding & classification

Source type: technical build writeup

22 fields verified against source quotes, 1 dropped as unverifiable.

computer vision, enterprise search, product catalog, failure mode described, metric backed, named customer, production runtime claimed, tools described, software, cost reduction, cycle time reduction, throughput increase, technical build writeup, ecommerce ops, data sync enrichment