Instacart builds PARSE, a multi-modal LLM platform for catalog attribute extraction at scale
Instacart's catalog attribute creation relied on SQL-based rules and traditional ML models that struggled with complex or context-dependent attributes, required significant per-attribute engineering effort, and could not extract information from product images — resulting in slow development cycles and inconsistent attribute quality.
How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · Configure attribute extraction task
Teams use the platform in development mode to experiment with different models, prompts, and input sources.
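As a rough illustration of what configuring such a task could look like, the sketch below models one extraction configuration as a combination of model, prompt, and input sources, then enumerates the combinations a team might compare in development mode. It is not PARSE's actual interface: the names ExtractionConfig, candidate_configs, render_prompt, the model names, and the "text"/"image" source labels are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from itertools import product
from typing import Iterable, List, Sequence, Tuple


@dataclass(frozen=True)
class ExtractionConfig:
    """One candidate setup for extracting a single attribute (hypothetical shape)."""
    attribute: str                   # e.g. "is_organic"
    model: str                       # placeholder model name, not a real endpoint
    prompt_template: str             # may reference {attribute}, {title}, {description}
    input_sources: Tuple[str, ...]   # assumed labels such as "text" or "image"

    def render_prompt(self, title: str = "", description: str = "") -> str:
        # str.format ignores keyword arguments the template does not use.
        return self.prompt_template.format(
            attribute=self.attribute, title=title, description=description
        )


def candidate_configs(
    attribute: str,
    models: Sequence[str],
    prompt_templates: Sequence[str],
    input_source_sets: Iterable[Sequence[str]],
) -> List[ExtractionConfig]:
    """Enumerate every model x prompt x input-source combination to try."""
    return [
        ExtractionConfig(attribute, model, template, tuple(sources))
        for model, template, sources in product(
            models, prompt_templates, input_source_sets
        )
    ]


configs = candidate_configs(
    attribute="is_organic",
    models=["model-a", "model-b"],
    prompt_templates=["Extract {attribute} from: {title}"],
    input_source_sets=[("text",), ("text", "image")],
)
```

Each resulting configuration could then be run against a small labeled sample and compared on quality and cost before one is chosen; that evaluation step is left out here because the source does not describe it.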
Tools used
LLMs, PARSE
Outcome
PARSE accelerated attribute extraction. Simpler attributes now take one day of effort, compared to one week previously, and iteration on complex attributes was reduced to three days. Multi-modal LLMs increased recall by 10% over text-only models, and simpler attributes can be handled with less powerful models at a 70% cost reduction.
What failed first
Pre-LLM approaches — SQL rules and traditional ML models — failed to scale: SQL handled only simple keyword-based extractions, ML required separate labeled datasets and pipelines per attribute, and neither could process image-based product data.
Results
Time saved: one day of effort, compared to one week previously