Shopify Merlin Online Inference: deploying ML models for real-time predictions at scale

Shopify needed a generalized, low-latency online inference solution to serve real-time ML predictions across many internal teams, each with distinct use-case requirements, ML frameworks, and integration needs.

How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear under “Tools used” below.
Stage 1 · Merlin Project creation
A Merlin Project folder containing the code, configuration, and tests for the ML use case is created in the Merlin mono-repo.
Tools used
Ray, Comet ML, Feast, TensorFlow, PyTorch, XGBoost, MLServer, FastAPI, Google Kubernetes Engine, Flink, Buildkite, Podman, LightGBM, Scikit-learn, Hugging Face
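
As a rough illustration of Stage 1, the sketch below scaffolds a project folder holding code, configuration, and tests inside a mono-repo checkout. The source does not document the actual Merlin layout or tooling, so everything here is an assumption made for illustration: the `projects/` parent directory, the file names, the file contents, and the `create_project` helper are all hypothetical.

```python
from pathlib import Path
import tempfile

# Hypothetical layout. The source only says the folder contains
# code, configuration, and tests; these file names are illustrative.
PROJECT_FILES = {
    "src/predict.py": "def predict(features):\n    raise NotImplementedError\n",
    "config.yml": "project: {name}\n",
    "tests/test_predict.py": "def test_placeholder():\n    assert True\n",
}


def create_project(repo_root: Path, name: str) -> Path:
    """Create a project folder under repo_root/projects and return its path."""
    project_dir = repo_root / "projects" / name
    # exist_ok=False: fail loudly instead of overwriting an existing project.
    project_dir.mkdir(parents=True, exist_ok=False)
    for relative_path, template in PROJECT_FILES.items():
        target = project_dir / relative_path
        target.parent.mkdir(parents=True, exist_ok=True)
        # Fill in the project name wherever the template asks for it.
        target.write_text(template.replace("{name}", name))
    return project_dir


if __name__ == "__main__":
    # Scaffold into a throwaway directory and list the files that were created.
    with tempfile.TemporaryDirectory() as tmp:
        project = create_project(Path(tmp), "fraud_detection")
        for path in sorted(p for p in project.rglob("*") if p.is_file()):
            print(path.relative_to(project))
```

Keeping code, configuration, and tests together in one folder per use case means a single directory can be reviewed, tested, and deployed as a unit, which is the property the stage description emphasizes.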
Outcome

Merlin Online Inference is in production, empowering data science teams with low latency, scalability, and fast iteration cycles for ML model serving across use cases including fraud detection, product categorization, and inbox classification.

Source

https://shopify.engineering/shopifys-machine-learning-platform-real-time-predictions

Grounding & classification
Source type: technical build writeup
32 fields verified against source quotes.
Tags: document classification, fraud detection, predictive analytics, recommendation system, product catalog, named customer, production runtime claimed, tools described, workflow described, ecommerce, software, employee productivity, response time reduction, technical build writeup, back office ops, data sync enrichment