Instacart builds Maple to enable large-scale LLM batch processing across engineering teams
Instacart's engineering teams needed to run millions of LLM calls for catalog enrichment, fulfillment routing, and search relevance workflows, but real-time LLM provider APIs were not designed for that scale. The result was rate limiting, duplicated infrastructure code across teams, and growing cost pressure.
How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · Team submits file and prompt
Teams submit a CSV or Parquet input file along with a prompt to Maple to initiate large-scale LLM processing.
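As a rough illustration of this stage, the sketch below pairs each row of a CSV input with a prompt template and groups the resulting requests into fixed-size batches. It is not Maple's actual interface, which the source does not describe: the function names `build_requests` and `split_into_batches`, the column names, the request fields, and the batch size are all hypothetical.

```python
import csv
import io
from typing import Iterator


def build_requests(csv_text: str, prompt_template: str) -> list[dict]:
    """Pair each input row with the prompt, producing one request per row.

    Hypothetical helper: the request shape (row_id, prompt) is an assumption.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {"row_id": i, "prompt": prompt_template.format(**row)}
        for i, row in enumerate(reader)
    ]


def split_into_batches(requests: list[dict], batch_size: int) -> Iterator[list[dict]]:
    """Yield consecutive slices of at most batch_size requests."""
    for start in range(0, len(requests), batch_size):
        yield requests[start:start + batch_size]


# Made-up sample data standing in for a team's input file.
sample_csv = (
    "product_name,brand\n"
    "Organic Bananas,Acme\n"
    "Whole Milk,DairyCo\n"
    "Sourdough Loaf,Bakehouse\n"
)
template = "Write a one-sentence description for {product_name} by {brand}."

requests = build_requests(sample_csv, template)
batches = list(split_into_batches(requests, batch_size=2))
```

In practice the input would be read from a file rather than an inline string, and a Parquet input would need a Parquet reader in place of `csv.DictReader`.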
Maple saves up to 50% on LLM costs compared to real-time calls and has reduced the cost of many processes from hundreds of thousands of dollars per year to just thousands. It has become part of the backbone of Instacart's AI infrastructure, supporting 10M+ prompt jobs.
What failed first
Before Maple, rate-limited real-time LLM calls introduced delays, each team independently wrote batch processing code from scratch, and pipelines were not reusable: every new use case required code modifications.
Results
Time saved: most batches complete in under 12 hours
Volume: reduced from hundreds of thousands of dollars per year to just thousands of dollars per year