Dropbox discovers and discloses repeated-token training data extraction vulnerability in GPT-3.5 and GPT-4
OpenAI's GPT-3.5 and GPT-4 models were vulnerable to divergence attacks triggered by repeated token sequences, allowing extraction of memorized training data including PII and sensitive information. After OpenAI deployed an initial mitigation filtering single-token repeats, multi-token repeat sequences still allowed exploitation of the same vulnerability.
How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · Internal AI security review
Dropbox discovered a ChatGPT prompt injection vulnerability while performing an internal security review of its AI-powered products in April 2023.
Tools used
GPT-3.5, GPT-4, Python, LangChain
Outcome
After Dropbox's responsible disclosure in January 2024, OpenAI confirmed the training data extraction vulnerabilities and patched them by extending filtering to block multi-token repeat prompts and adding a server-side timeout for long-running requests.
What failed first
OpenAI's initial filtering defense focused exclusively on single-token repetitions (reflecting the emphasis of the Scalable Extraction paper), leaving multi-token repeat sequences able to induce the same model divergence and training data extraction from both GPT-3.5 and GPT-4.
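The gap between a single-token filter and a multi-token filter can be illustrated with a small sketch. This is not OpenAI's or Dropbox's actual implementation: the function names, the `threshold` value, and the use of plain string lists in place of real tokenizer output are all assumptions made for illustration.

```python
from typing import List


def max_consecutive_repeats(tokens: List[str], period: int) -> int:
    """Largest number of back-to-back copies of any block of `period` tokens."""
    if period < 1:
        raise ValueError("period must be at least 1")
    best = 0
    for start in range(len(tokens) - period + 1):
        block = tokens[start:start + period]
        count = 1
        pos = start + period
        # Count how many times the same block immediately follows itself.
        while tokens[pos:pos + period] == block:
            count += 1
            pos += period
        best = max(best, count)
    return best


def has_repeat(tokens: List[str], max_period: int = 1, threshold: int = 50) -> bool:
    """Flag a prompt if any block of 1..max_period tokens repeats `threshold` times.

    max_period=1 models a single-token-only filter (hypothetical);
    a larger max_period models a filter extended to multi-token repeats.
    """
    return any(
        max_consecutive_repeats(tokens, period) >= threshold
        for period in range(1, max_period + 1)
    )


# Hypothetical token sequences standing in for real tokenizer output.
single = ["dog"] * 100
multi = ["alpha", "beta"] * 100

print(has_repeat(single, max_period=1))  # single-token repeat is caught
print(has_repeat(multi, max_period=1))   # two-token repeat slips past
print(has_repeat(multi, max_period=2))   # extended filter catches it
```

With `max_period=1` the alternating two-token sequence never shows the same token twice in a row, so the check passes it through. Raising `max_period` closes that gap at the cost of more comparisons per prompt.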