
Canva recommendation system: handling empty results, irrelevant outputs, and production failures at 60M+ user scale

Canva's personalization system serves over 60 million monthly active users and faces two recurring failure classes. The first is unexpected results: empty recommendations caused by cold start or low model confidence, and irrelevant outputs caused by model imperfections. The second is failure to respond: high latency from large deep learning models, and horizontal scaling limits hit during peak traffic while most engineers are asleep in Australia.

How it works

Common implementation structure

How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack.

Stage 1 · User template interaction triggers cycle

Every time a user interacts with a template, the recommendation update cycle is initiated in the background.
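A minimal sketch of how an interaction-triggered update could feed a near-line cache, so that the serving path reads precomputed results instead of calling a large model per request. Everything here is an assumption for illustration: the `NearLineCache` class, its `on_interaction` and `get` methods, the `ttl_seconds` parameter, and the `model` callable are hypothetical names, not Canva's implementation. The update runs synchronously to keep the example short; a real system would hand it to a background worker or queue.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class NearLineCache:
    """Hypothetical near-line cache: recompute on interaction, serve from memory."""

    # Assumed model interface: (user_id, interaction history) -> ranked template ids.
    model: Callable[[str, List[str]], List[str]]
    ttl_seconds: float = 3600.0
    _history: Dict[str, List[str]] = field(default_factory=dict)
    _cache: Dict[str, Tuple[float, List[str]]] = field(default_factory=dict)

    def on_interaction(self, user_id: str, template_id: str,
                       now: Optional[float] = None) -> None:
        """Record the interaction and refresh this user's cached recommendations."""
        now = time.time() if now is None else now
        history = self._history.setdefault(user_id, [])
        history.append(template_id)
        # In production this call would run off the request path (assumption).
        self._cache[user_id] = (now, self.model(user_id, history))

    def get(self, user_id: str, now: Optional[float] = None) -> Optional[List[str]]:
        """Return cached recommendations, or None if missing or older than the TTL."""
        now = time.time() if now is None else now
        entry = self._cache.get(user_id)
        if entry is None:
            return None
        stored_at, recs = entry
        if now - stored_at > self.ttl_seconds:
            return None
        return recs
```

Returning `None` for a missing or stale entry leaves the caller free to decide what to serve instead, for example a locale-specific fallback list.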

Tools used

deep learning models, caching layer

Outcome

Canva mitigates recommendation failures through locale- and platform-specific fallbacks, near-line inference caching to keep recommendations reactive to user interactions, metric-threshold deployment gates, visual model reports for debugging, auto-scaling policies, and independent per-model controllers enabling rollback or switch-off during incidents without affecting other models.
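The combination of per-model controllers and locale- and platform-specific fallbacks can be sketched as follows. This is an illustrative assumption, not Canva's code: `ModelController`, `recommend`, the `enabled` flag, and the fallback table keyed by `(locale, platform)` are hypothetical names. The idea shown is that each model can be switched off on its own, and that an empty result, an error, or a disabled model all lead to the same fallback path.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Assumed model interface: user_id -> ranked template ids (may be empty).
Recommender = Callable[[str], List[str]]


@dataclass
class ModelController:
    """Hypothetical per-model switch, independent of every other model."""

    name: str
    predict: Recommender
    enabled: bool = True


def recommend(
    user_id: str,
    locale: str,
    platform: str,
    controllers: List[ModelController],
    fallbacks: Dict[Tuple[str, str], List[str]],
    default: List[str],
) -> List[str]:
    """Try each enabled model in order; fall back when none yields results."""
    for controller in controllers:
        if not controller.enabled:
            continue  # switched off during an incident; other models unaffected
        try:
            results = controller.predict(user_id)
        except Exception:
            continue  # treat a failing model like an empty one
        if results:
            return results
    # No model produced anything: use the locale/platform list, else a global default.
    return fallbacks.get((locale, platform), default)
```

Setting `enabled = False` on a single controller removes only that model from the chain, which mirrors the switch-off behavior described above without requiring a redeploy of the others.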

What failed first

Recommendation models have produced no results or irrelevant results; horizontal scaling limits have been hit multiple times due to Canva's fast-growing user base or new models requiring larger machines; and some models take around 15 to 20 minutes to scale, making roll-forward during incidents impractical.

Results

Scale: more than 60 million monthly active users

Source

https://www.canva.dev/blog/engineering/recommender-systems-when-they-fail-who-are-you-gonna-call/


Grounding & classification

Source type: technical build writeup

14 fields verified against source quotes.

personalization, recommendation system, product catalog, failure mode described, named customer, production runtime claimed, tools described, workflow described, software, technical build writeup, back office ops, escalation workflow, monitor detect alert