Booking.com builds Granomaly: a statistical anomaly detection service for time series business metrics
Static thresholds and naive week-over-week comparison failed to reliably catch anomalies in fluctuating business metrics like daily sales or order volume, because an anomaly in one week became the flawed baseline for the next.
Granomaly produces a smoothed prediction range that Grafana uses to detect both sudden outages and slow gradual declines, handles overlapping historical anomalies, and supports event-specific corrections; a simulation feature reduced the parameter-tuning feedback loop from days to seconds.
Several approaches were tried and abandoned before arriving at the final design: z-score alerting caused a spike in false alarms at night due to low user activity, was not human-readable, and Graphite lacked usable sliding-window support. A percentile-based range was distorted by overlapping past outages, and an approach that excluded the most deviant historical week always removed a data point even when no true outlier existed, producing an unstable range.