How Pinterest Built a Real-Time Radar for Violative Content Using AI
Pinterest's Trust & Safety teams could only measure policy-violating content prevalence through expensive human review studies run roughly every six months. User reports alone were incomplete, missing under-reported harms and lacking statistical power for rare categories.
How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · Daily impressions stream sampled
Images are sampled from the daily user impressions stream to start the prevalence measurement workflow.
Tools used
multimodal LLM
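The sampling step in Stage 1 can be sketched as follows. The source does not say how the impressions stream is sampled, so this is only an illustration under stated assumptions: it uses reservoir sampling (Algorithm R) to draw a fixed-size uniform sample from a stream of unknown length. The function name `reservoir_sample`, the sample size `k`, and the record shape are hypothetical.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int, seed: Optional[int] = None) -> List[T]:
    """Draw a uniform random sample of up to k items from a stream (Algorithm R).

    Assumption: `stream` yields one record per impression. This is a
    hypothetical stand-in for a daily impressions log, not a documented format.
    """
    rng = random.Random(seed)
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Keep the new item with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


# Usage: sample 10 impressions from a simulated day of 1,000 impression records.
daily_sample = reservoir_sample(range(1000), k=10, seed=0)
```

If the log holds one record per impression, a uniform draw over records is automatically impression-weighted: content that is viewed more often is proportionally more likely to be sampled, which is what a prevalence estimate over impressions needs.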
Outcome
The AI-assisted workflow enables daily prevalence measurement with 15x faster labeling turnaround and orders of magnitude lower operational cost than a human-only workflow, while preserving comparable decision quality and enabling continuous monitoring with real-time alerting.
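To make "daily prevalence measurement" with alerting concrete, here is a minimal sketch of one common way to turn a labeled sample into a prevalence estimate with an uncertainty range. The source does not describe the estimator or the alert rule, so both are assumptions: the sketch uses a Wilson score interval for a binomial proportion, and the alert rule, the `baseline` parameter, and the function names are hypothetical.

```python
import math
from typing import Tuple


def wilson_interval(violations: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95% coverage)."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= violations <= n:
        raise ValueError("violations must be between 0 and n")
    p = violations / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to absorb floating-point noise at the boundaries.
    return max(0.0, center - half), min(1.0, center + half)


def should_alert(violations: int, n: int, baseline: float, z: float = 1.96) -> bool:
    """Hypothetical rule: alert only when the whole interval sits above the baseline."""
    low, _ = wilson_interval(violations, n, z)
    return low > baseline


# Usage: 5 violating items found in a daily sample of 1,000 labeled impressions.
low, high = wilson_interval(5, 1000)
alert = should_alert(5, 1000, baseline=0.001)
```

The Wilson interval is chosen here because it behaves sensibly when violations are rare or absent in the sample, which is the typical case for low-prevalence categories. Alerting on the lower bound rather than the point estimate reduces false alarms caused by sampling noise on small daily samples.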
What failed first
Human-only prevalence studies required at least two independent reviewers per item plus adjudication, were subject to instability, ran infrequently, and produced post-intervention comparisons that were slow and hard to trust.
Results
Time saved: prevalence measured daily instead of roughly every six months
Volume: 15x faster labeling turnaround
Cost replaced: orders of magnitude lower operational cost