
Space-efficient ML feature stores using bloom filters: a Zalando engineering benchmark

Conventional ML feature stores built on key-value stores grow very large, which creates several problems: each network call adds 2–10 ms of latency per lookup, distributed databases with strict performance requirements are expensive to host, backfill operations become very costly, and multiple feature lookups per request become prohibitively expensive under strict latency budgets.

How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · User request triggers lookup
A request is made to the recommender system, triggering a feature store query using the user ID.
Tools used
Redis, Bloom-Filter
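The lookup in Stage 1 can be sketched as follows. This is a minimal illustration, not the implementation described in the source: it assumes categorical features drawn from a small known vocabulary, and it assumes one plausible encoding in which each (user ID, feature value) pair is inserted into a Bloom filter, so that a lookup becomes one membership test per vocabulary entry. The names BloomFilter, BloomFeatureStore, put, and lookup, the key format "user|value", and the parameters num_bits and num_hashes are all hypothetical.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter using double hashing over a SHA-256 digest."""

    def __init__(self, num_bits: int, num_hashes: int):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: str):
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # keep the step nonzero
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key: str) -> bool:
        # True may be a false positive; False is always correct.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )


class BloomFeatureStore:
    """Hypothetical in-process feature store backed by one Bloom filter."""

    def __init__(self, vocabulary, num_bits: int, num_hashes: int):
        self.vocabulary = list(vocabulary)
        self.filter = BloomFilter(num_bits, num_hashes)

    def put(self, user_id: str, feature_value: str) -> None:
        self.filter.add(f"{user_id}|{feature_value}")

    def lookup(self, user_id: str):
        # One membership test per vocabulary entry, no network call involved.
        return [v for v in self.vocabulary if f"{user_id}|{v}" in self.filter]


store = BloomFeatureStore(
    vocabulary=["sneakers", "jackets", "dresses"], num_bits=10_000, num_hashes=7
)
store.put("user-42", "sneakers")
store.put("user-42", "jackets")
features = store.lookup("user-42")  # includes at least "sneakers" and "jackets"
```

Because a Bloom filter never returns a false negative, every stored value is recovered; an occasional extra value may appear as a false positive, which is the accuracy cost traded for the smaller memory footprint.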
Outcome

A bloom-filter-backed compressed feature store achieves the same click-prediction classification performance (AUC ≈ 0.7997) as a conventional key-value store while using only 3% of the memory, with no detectable throughput overhead.
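To see where memory savings of this order can come from, the standard Bloom filter sizing formulas are worked through below: m = -n ln(p) / (ln 2)^2 bits for n items at false-positive rate p, and k = (m / n) ln 2 hash functions. The item count, the error rate, and the bytes-per-entry figure for the key-value store are hypothetical inputs chosen for illustration; they are not taken from the source, and the resulting ratio is not a reproduction of the reported 3%.

```python
import math


def bloom_size_bits(num_items: int, error_rate: float) -> int:
    """Bits needed for num_items at the target false-positive rate."""
    return math.ceil(-num_items * math.log(error_rate) / math.log(2) ** 2)


def optimal_num_hashes(num_bits: int, num_items: int) -> int:
    """Number of hash functions that minimizes the false-positive rate."""
    return max(1, round(num_bits / num_items * math.log(2)))


# Hypothetical workload: 10 million (user, feature value) pairs.
num_pairs = 10_000_000
error_rate = 0.01            # assumed tolerable false-positive rate
bytes_per_kv_entry = 64      # assumed key + value + overhead in a key-value store

num_bits = bloom_size_bits(num_pairs, error_rate)
num_hashes = optimal_num_hashes(num_bits, num_pairs)

bloom_bytes = num_bits / 8
kv_bytes = num_pairs * bytes_per_kv_entry
ratio = bloom_bytes / kv_bytes

print(f"bloom filter: {bloom_bytes / 1e6:.1f} MB, {num_hashes} hash functions")
print(f"key-value store: {kv_bytes / 1e6:.1f} MB")
print(f"bloom / key-value memory ratio: {ratio:.3f}")
```

The filter size depends only on the number of items and the error rate, not on the length of the keys, which is why the gap widens as keys and values get longer.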

Results
Time saved: 2–10 ms per lookup
Volume: 3% of the conventional store's memory
Source

https://engineering.zalando.com/posts/2021/10/space-efficient-machine-learning-feature-stores-using-probabilistic-data-structures.html


Grounding & classification
Source type: technical build writeup
20 fields verified against source quotes.
predictive analytics, recommendation system, metric backed, named customer, source backed, tools described, workflow described, ecommerce, cost reduction, throughput increase, technical build writeup, ecommerce ops, data sync enrichment