Shopify Merlin Online Inference: deploying ML models for real-time predictions at scale

Shopify needed a generalized, low-latency online inference solution to serve real-time ML predictions across many internal teams, each with distinct use-case requirements, ML frameworks, and integration needs.

How it works

Common implementation structure

How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools used” below.

Stage 1 · Merlin Project creation

A Merlin Project folder containing the code, configuration, and tests for the ML use case is created in the Merlin mono-repo.
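
As a rough illustration of this stage, the sketch below scaffolds a project folder with code, configuration, and tests inside a mono-repo checkout. It is not Merlin's actual tooling: the `create_project` helper, the `projects/` directory, and the file names are all hypothetical, chosen only to show the shape of the step.

```python
from pathlib import Path
import tempfile

# Hypothetical layout: these file names are illustrative assumptions,
# not the structure Merlin actually uses.
PROJECT_FILES = {
    "src/model.py": "# model loading and prediction code goes here\n",
    "config.yaml": "name: {name}\n",
    "tests/test_model.py": "# tests for the use case go here\n",
}


def create_project(monorepo_root: Path, name: str) -> Path:
    """Create a project folder holding code, configuration, and tests."""
    project_dir = monorepo_root / "projects" / name
    # exist_ok=False so an existing project is never silently overwritten.
    project_dir.mkdir(parents=True, exist_ok=False)
    for relative_path, template in PROJECT_FILES.items():
        target = project_dir / relative_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(template.format(name=name))
    return project_dir


if __name__ == "__main__":
    # Use a temporary directory as a stand-in for the mono-repo root.
    with tempfile.TemporaryDirectory() as tmp:
        created = create_project(Path(tmp), "fraud_detection")
        for path in sorted(p for p in created.rglob("*") if p.is_file()):
            print(path.relative_to(created))
```

Keeping code, configuration, and tests together in one folder per use case means each project can be reviewed, tested, and deployed as a unit, which is the property the stage description emphasizes.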

Tools used

Ray, Comet ML, Feast, TensorFlow, PyTorch, XGBoost, MLServer, FastAPI, Google Kubernetes Engine, Flink, Buildkite, Podman, LightGBM, Scikit-learn, Hugging Face

Outcome

Merlin Online Inference is in production, empowering data science teams with low latency, scalability, and fast iteration cycles for ML model serving across use cases including fraud detection, product categorization, and inbox classification.

Source

https://shopify.engineering/shopifys-machine-learning-platform-real-time-predictions

Grounding & classification

Source type: technical build writeup

32 fields verified against source quotes.

document classification, fraud detection, predictive analytics, recommendation system, product catalog, named customer, production runtime claimed, tools described, workflow described, ecommerce, software, employee productivity, response time reduction, technical build writeup, back office ops, data sync enrichment