
Canva builds synthetic data evaluation pipeline to improve private design search without accessing user data

Because strict privacy constraints prevent access to real user designs or queries, Canva's engineers could test changes to private design search only through a handful of manual queries on their own accounts. They then had to wait days for online A/B experiments to validate each change.

How it works

Common implementation structure

How this type of workflow is generally built, generalized across documented cases rather than tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.

Stage 1 · Generate synthetic design content

GPT-4o is seeded with a realistic topic and sampled design type, prompted to brainstorm titles, then used again with a second prompt to generate corresponding text content.
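
The two-prompt flow described above can be sketched roughly as follows. This is a minimal illustration, not Canva's implementation: the seed lists, the prompt wording, the `generate_synthetic_design` function, and the `fake_llm` stand-in are all hypothetical, and the model is passed in as a plain callable so the sketch runs without any API access.

```python
import random
from typing import Callable, Dict

# Hypothetical seed lists; the source does not enumerate topics or design types.
TOPICS = ["school bake sale", "quarterly sales review", "yoga studio opening"]
DESIGN_TYPES = ["presentation", "poster", "instagram post"]


def generate_synthetic_design(
    llm: Callable[[str], str], rng: random.Random
) -> Dict[str, str]:
    """Produce one synthetic design using two separate prompts."""
    # Seed the generation with a topic and a sampled design type.
    topic = rng.choice(TOPICS)
    design_type = rng.choice(DESIGN_TYPES)

    # Prompt 1: brainstorm candidate titles, one per line (assumed format).
    title_prompt = (
        f"Brainstorm realistic titles for a {design_type} about {topic}. "
        "Return one title per line."
    )
    titles = [t.strip() for t in llm(title_prompt).splitlines() if t.strip()]
    title = rng.choice(titles)

    # Prompt 2: generate text content that matches the chosen title.
    content_prompt = (
        f"Write the text content of a {design_type} titled '{title}' "
        f"about {topic}."
    )
    content = llm(content_prompt)

    return {
        "topic": topic,
        "design_type": design_type,
        "title": title,
        "content": content,
    }


def fake_llm(prompt: str) -> str:
    """Deterministic stand-in for a real model call (assumption)."""
    if prompt.startswith("Brainstorm"):
        return "Title A\nTitle B\nTitle C"
    return "Placeholder body text."


design = generate_synthetic_design(fake_llm, random.Random(0))
print(design)
```

Passing a seeded `random.Random` keeps the sampling repeatable in this sketch; a real model call would need its own handling (for example cached outputs) to stay reproducible.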

Tools used

GPT-4o, LLMs, Testcontainers, Elasticsearch, Streamlit

Outcome

The synthetic evaluation pipeline produces fully reproducible results on more than 1000 test cases in under 10 minutes, enabling more than 300 offline evaluations in the same time a single online experiment takes, all without accessing any real user data.
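
To illustrate what a reproducible offline evaluation over synthetic test cases might look like, here is a minimal sketch. It assumes each test case pairs a query with the ID of the design that should be retrieved, and it scores a search function with recall@k. The metric choice, the `recall_at_k` helper, and the toy in-memory `toy_search` function are assumptions made for illustration; the summary above does not specify them.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# A test case is (query, expected design id); this shape is an assumption.
TestCase = Tuple[str, str]


def recall_at_k(
    search: Callable[[str], List[str]], cases: Sequence[TestCase], k: int = 10
) -> float:
    """Fraction of test cases whose expected design is in the top k results."""
    if not cases:
        raise ValueError("at least one test case is required")
    hits = sum(1 for query, expected in cases if expected in search(query)[:k])
    return hits / len(cases)


# Toy in-memory index standing in for a real search backend (hypothetical data).
INDEX: Dict[str, str] = {
    "d1": "school bake sale poster",
    "d2": "quarterly sales review deck",
}


def toy_search(query: str) -> List[str]:
    """Rank designs by the number of query terms they share; ties break by ID."""
    terms = set(query.lower().split())
    scored = [
        (len(terms & set(text.split())), doc_id) for doc_id, text in INDEX.items()
    ]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in ranked if score > 0]


cases: List[TestCase] = [
    ("bake sale", "d1"),
    ("sales review", "d2"),
    ("yoga flyer", "d3"),  # no matching design in the toy index
]
score = recall_at_k(toy_search, cases, k=1)
print(f"recall@1 = {score:.3f}")
```

Because both the test cases and the ranking are deterministic, rerunning the evaluation yields the same score. For scale, 1000 test cases in under 10 minutes (600 seconds) works out to less than 0.6 seconds per test case on average.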

What failed first

Limited offline testing had low statistical power to catch poorly performing changes, and progressing quickly to online experiments risked exposing real users to degraded search behavior.

Results

Time saved: more than 1000 test cases in less than 10 minutes

Source

https://www.canva.dev/blog/engineering/how-to-improve-search-without-looking-at-queries-or-results/


Grounding & classification

Source type: technical build writeup

20 fields verified against source quotes, 1 dropped as unverifiable.

content generation, enterprise search, knowledge base, failure mode described, metric backed, named customer, production runtime claimed, tools described, vendor confirmed, workflow described, software, cycle time reduction, employee productivity, technical build writeup, quality assurance