back_office_ops · workflow

Uber's GenAI Gateway: unified LLM platform serving 16 million queries per month across close to 30 teams

Disparate LLM integration strategies across Uber's engineering teams led to inefficiencies and redundant efforts, while a rapidly growing number of LLM use cases made a centralized approach necessary.

How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases — not tied to any one vendor's stack. Click any stage to read what happens there. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · Security review before access
A standardized review process, managed by the Engineering Security team, reviews use cases against Uber's data handling standard before access to the gateway is granted.
Tools used
GenAI GatewayOpenAIVertex AILangChainLlamaIndexSTOA
Outcome

GenAI Gateway is used by close to 30 customer teams and serves 16 million queries per month with a peak QPS of 25, providing a single OpenAI-compatible interface to models from OpenAI, Vertex AI, and Uber-hosted LLMs.

Results
Time saved16 million
Volumeover 60
Source

https://www.uber.com/mx/en/blog/genai-gateway/

How we source this →

Grounding & classification
Source type: technical build writeup
24 fields verified against source quotes.
content generationsupport agentfailure mode describedhuman review describedmetric backednamed customerproduction runtime claimedsource backedtools describedworkflow describedsoftwareemployee productivitythroughput increasetechnical build writeupback office ops