Zalando migrates real-time fraud detection from Python/scikit-learn to Scala/Spark for platform scale
Zalando's Python-based fraud detection system could not scale to the demands of its expanding fashion platform: the Python GIL blocked multithreading for concurrent predictions, training data exhausted in-house cluster memory, JSON configuration became unmanageable, and shared cluster resources created bottlenecks.
How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · Order received as JSON
Order data arrives in the form of a JSON request to the prediction service.
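A minimal sketch of this stage, assuming a Python service and a hypothetical order schema. The field names (order_id, customer_id, total_amount, item_count) and the parse_order helper are illustrative assumptions; the source does not describe the actual request format.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    # Hypothetical fields; the real order schema is not given in the source.
    order_id: str
    customer_id: str
    total_amount: float
    item_count: int


REQUIRED_FIELDS = ("order_id", "customer_id", "total_amount", "item_count")


def parse_order(raw_body: str) -> Order:
    """Decode a JSON request body into an Order, rejecting malformed input."""
    try:
        payload = json.loads(raw_body)
    except json.JSONDecodeError as exc:
        raise ValueError(f"request body is not valid JSON: {exc}") from exc
    if not isinstance(payload, dict):
        raise ValueError("request body must be a JSON object")
    # Report every missing field at once rather than failing on the first.
    missing = [name for name in REQUIRED_FIELDS if name not in payload]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    # Coerce to the declared types so downstream code sees a fixed shape.
    return Order(
        order_id=str(payload["order_id"]),
        customer_id=str(payload["customer_id"]),
        total_amount=float(payload["total_amount"]),
        item_count=int(payload["item_count"]),
    )
```

Validating the request at the boundary means later stages work with a typed record instead of a raw dictionary, so a malformed order is rejected before it reaches the model.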
Tools used
scikit-learn, CherryPy, Play framework, MLlib
Outcome
The new Scala and Spark system on AWS reduced overall learning time by a factor of two and cut prediction response time at 20 concurrent requests from ~1000 ms to ~70 ms. A sparse feature condenser also improved prediction accuracy by more than 25% while more than halving runtime.
What failed first
The original Python system, which used CherryPy to serve requests and scikit-learn for ML on a static in-house cluster, failed to scale: Python's GIL prevented concurrent predictions, cluster memory capped training data size, and growing JSON configuration complexity blocked safe refactoring.