
Space-efficient ML feature stores using bloom filters: a Zalando engineering benchmark

Conventional key-value-store-based ML feature stores become very large and impose significant challenges: network calls add 2–10ms of latency per lookup, distributed databases with strict performance requirements are expensive to host, backfill operations become very costly, and multiple feature lookups per request become prohibitively expensive under strict latency budgets.

How it works

Common implementation structure

How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.

Stage 1 · User request triggers lookup

A request is made to the recommender system, triggering a feature store query using the user ID.

Tools used

Redis, Bloom-Filter

Outcome

A bloom-filter-backed compressed feature store achieves the same click-prediction classification performance (AUC ≈ 0.7997) as a conventional key-value store while using only 3% of the memory, with no detectable throughput overhead.
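
The idea behind this outcome can be illustrated with a minimal sketch. It is not the implementation from the source, which is not reproduced on this page. It assumes that features are categorical with a small, known set of candidate values, that each (user ID, feature, value) triple is encoded as one string and inserted into a single Bloom filter, and that a lookup tests every candidate value for membership. The class names BloomFilter and BloomFeatureStore, the key encoding, and the 1% false-positive target are all hypothetical choices made for illustration.

```python
import hashlib
import math


class BloomFilter:
    """Plain Bloom filter over a bit array (illustrative, not production-grade)."""

    def __init__(self, expected_items: int, false_positive_rate: float):
        # Standard sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes.
        self.num_bits = max(
            8,
            math.ceil(-expected_items * math.log(false_positive_rate) / math.log(2) ** 2),
        )
        self.num_hashes = max(1, round(self.num_bits / expected_items * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item: str):
        # Double hashing: derive k bit positions from two 64-bit halves of one digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


class BloomFeatureStore:
    """Hypothetical feature store: membership of (user, feature, value) triples."""

    def __init__(self, feature_values: dict, expected_items: int,
                 false_positive_rate: float = 0.01):
        # feature_values maps a feature name to its list of candidate values (assumed known).
        self.feature_values = feature_values
        self.filter = BloomFilter(expected_items, false_positive_rate)

    @staticmethod
    def _key(user_id: str, feature: str, value: str) -> str:
        return f"{user_id}|{feature}|{value}"

    def put(self, user_id: str, feature: str, value: str) -> None:
        self.filter.add(self._key(user_id, feature, value))

    def get(self, user_id: str, feature: str) -> list:
        # Stored values are always returned (no false negatives);
        # extra values can appear at roughly the configured false-positive rate.
        return [
            value
            for value in self.feature_values[feature]
            if self._key(user_id, feature, value) in self.filter
        ]


store = BloomFeatureStore({"age_bucket": ["18-24", "25-34", "35-44"]}, expected_items=1000)
store.put("user-42", "age_bucket", "25-34")
print(store.get("user-42", "age_bucket"))
```

Because the filter stores only bits and never the keys or values themselves, memory depends on the number of inserted triples and the chosen false-positive rate, not on key length. The trade-off is that a lookup can occasionally report a value that was never stored, so the downstream model has to tolerate a small amount of feature noise.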

Results

Time saved: 2-10ms

Volume: 3%

Source

https://engineering.zalando.com/posts/2021/10/space-efficient-machine-learning-feature-stores-using-probabilistic-data-structures.html


Grounding & classification

Source type: technical build writeup

20 fields verified against source quotes.

predictive analytics, recommendation system, metric backed, named customer, source backed, tools described, workflow described, ecommerce, cost reduction, throughput increase, technical build writeup, ecommerce ops, data sync enrichment