
Dropbox security research: prompt injection via control characters in GPT-3.5 and GPT-4

Dropbox's security team found that control characters in user-supplied LLM prompt inputs can circumvent system-level instructions, enabling prompt injection attacks that cause models to ignore their context constraints or hallucinate.

How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · LLM injection threat identified
Dropbox's security team identifies injection attacks on LLM queries as a focus area for hardening internal infrastructure.
Tools used
GPT-3.5, GPT-4, ChatGPT, Python 3
Outcome

Dropbox demonstrated that prepending enough control characters to LLM prompt inputs causes GPT-3.5 and GPT-4 to disregard their system instructions and hallucinate. The team shared its findings with OpenAI and identified input sanitization as the primary mitigation.
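The input sanitization mentioned above could look roughly like the following. This is a minimal sketch, not Dropbox's implementation: the function names, the `MAX_QUESTION_CHARS` limit, and the choice to turn newlines and tabs into spaces are all assumptions made for illustration. It drops every character in Unicode category `Cc` (which includes backspace and carriage return) before the text reaches a prompt.

```python
import unicodedata

# Hypothetical length cap; a real limit depends on the application.
MAX_QUESTION_CHARS = 2000


def strip_control_chars(text: str) -> str:
    """Drop Unicode control characters (category Cc) from text.

    Newlines and tabs are turned into single spaces so words stay separated;
    every other control character (e.g. backspace, carriage return) is removed.
    """
    cleaned = []
    for ch in text:
        if ch in ("\n", "\t"):
            cleaned.append(" ")
        elif unicodedata.category(ch) == "Cc":
            continue
        else:
            cleaned.append(ch)
    return "".join(cleaned)


def sanitize_question(raw: str) -> str:
    """Return a cleaned question, or raise ValueError if nothing usable remains."""
    cleaned = strip_control_chars(raw).strip()
    if not cleaned:
        raise ValueError("question is empty after sanitization")
    if len(cleaned) > MAX_QUESTION_CHARS:
        raise ValueError("question exceeds the allowed length")
    return cleaned
```

Sanitizing before the prompt is rendered keeps the check independent of the instruction wording. Rejecting over-long input is a separate, assumed safeguard that limits how much padding of any kind can reach the model.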

What failed first

The prompt template designed to constrain LLM queries to a specific context and prevent instruction leakage was defeated when sufficient control characters were prepended to the question parameter, regardless of instruction wording or formatting.
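To make the failure concrete, the sketch below shows how a question parameter flows into a prompt template and how prepended control characters ride along with it. The template wording, the function names, and the padding count are hypothetical illustrations, not the prompt or values Dropbox used, and no model is called here.

```python
# Hypothetical template; the wording is an illustration only.
PROMPT_TEMPLATE = (
    "Answer the question using only the provided context. "
    "If the context does not contain the answer, say \"I don't know.\"\n\n"
    "Context: {context}\n\n"
    "Question: {question}\n\n"
    "Answer: "
)


def render_prompt(context: str, question: str) -> str:
    """Fill the template; the question is inserted verbatim, with no sanitization."""
    return PROMPT_TEMPLATE.format(context=context, question=question)


def pad_with_control_chars(question: str, count: int, char: str = "\b") -> str:
    """Prepend `count` copies of a control character (backspace by default)."""
    return char * count + question


# The padding count is arbitrary and chosen only for this example.
padded = pad_with_control_chars("What is the capital of France?", count=512)
prompt = render_prompt(context="Internal HR policy text.", question=padded)

# The instructions are still present, but hundreds of control characters
# now sit between them and the attacker-controlled question.
print(prompt.count("\b"))
```

Because the template inserts the parameter verbatim, rewording the instructions does not change what reaches the model. That is why the check belongs on the input rather than in the template.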

Results
Volume: 350 or more
Source

https://dropbox.tech/machine-learning/prompt-injection-with-control-characters-openai-chatgpt-llm


Grounding & classification
Source type: technical build writeup
17 fields verified against source quotes.
Tags: conversational ai, failure mode described, metric backed, named customer, source backed, tools described, workflow described, software, technical build writeup