QueryGPT: Uber builds a natural language to SQL system using LLMs and multi-agent architecture
Authoring SQL queries at Uber required deep knowledge of SQL syntax and internal data models and took around 10 minutes per query, creating a productivity bottleneck across approximately 1.2 million interactive queries per month.
How it works
Common implementation structure
This is how this type of workflow is generally built, generalized across documented cases rather than tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · User submits natural language prompt
A user submits a natural language question to QueryGPT to generate a SQL query.
Tools used
large language models, vector databases, RAG, OpenAI GPT-4 Turbo
Outcome
QueryGPT reduced SQL query authoring time from around 10 minutes to about 3 minutes, reached about 300 daily active users in limited release, and 78% of users reported the generated queries reduced the time they would have spent writing from scratch.
What failed first
The initial version of QueryGPT used simple RAG over a small sample set, and its accuracy declined as more tables were onboarded. Simple similarity search between natural language prompts and SQL schemas returned irrelevant results, and large schemas exceeded the LLM's available token limit.
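The two failure modes described here, weak prompt-to-schema similarity and schemas that overflow the context window, can be illustrated with a toy sketch. This is not Uber's implementation: bag-of-words cosine similarity stands in for embedding search, the token count is a crude regex approximation rather than a real tokenizer, and every name (`top_k_schemas`, `fits_budget`, the sample tables) is hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word/identifier tokens; a rough stand-in for a real tokenizer."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_k_schemas(prompt, schemas, k=2):
    """Rank table schemas by similarity to the prompt and return the top k names."""
    query = Counter(tokenize(prompt))
    scored = [
        (cosine(query, Counter(tokenize(ddl))), name)
        for name, ddl in schemas.items()
    ]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [name for _, name in ranked[:k]]


def fits_budget(prompt, schemas, selected, max_tokens):
    """Check whether the prompt plus the selected schemas stay under a token budget."""
    used = len(tokenize(prompt)) + sum(len(tokenize(schemas[n])) for n in selected)
    return used <= max_tokens


# Hypothetical schemas, for illustration only.
SCHEMAS = {
    "trips": "CREATE TABLE trips (trip_id BIGINT, city_id INT, status VARCHAR)",
    "drivers": "CREATE TABLE drivers (driver_id BIGINT, name VARCHAR)",
}

selected = top_k_schemas("completed trips by city", SCHEMAS, k=1)
print(selected, fits_budget("completed trips by city", SCHEMAS, selected, max_tokens=10))
```

In this sketch the prompt matches the `trips` table only because the literal word "trips" appears in both. The words "city" and "completed" contribute nothing, since the schema spells them `city_id` and `status`. That vocabulary gap between natural language and schema text is one plausible reason simple similarity search returns irrelevant tables, and the budget check shows how even a single small schema can exceed a tight token limit.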