customer_support · ecommerce · workflow

Instacart builds LACE, an LLM-based automated evaluation framework for its customer support chatbot

Instacart needed a reliable, scalable way to evaluate whether its AI-powered customer support chatbot was actually helping customers in real conversations. Human evaluation alone could not scale, and the chatbot's quality was difficult to measure objectively.

How it works

Common implementation structure

How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.

Stage 1 · Chat session submitted to LACE

Each full multi-turn conversation between a customer and the support chatbot is submitted to LACE for evaluation.
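A minimal sketch of what submitting a full session for evaluation could look like. The source does not describe LACE's interface, so everything here is a labeled assumption: the `Turn` type, the function names, the prompt wording, and the one-word True/False answer format are all hypothetical. The `judge` parameter stands in for whatever model call is used and is passed in as a plain function so the sketch runs without any vendor SDK.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Turn:
    """One message in a support conversation (hypothetical structure)."""
    role: str  # e.g. "customer" or "chatbot"
    text: str


def format_session(turns: List[Turn]) -> str:
    """Serialize the whole multi-turn conversation, one line per turn."""
    return "\n".join(f"{t.role}: {t.text}" for t in turns)


def build_eval_prompt(turns: List[Turn], criterion: str) -> str:
    """Build one evaluation prompt covering the full session (assumed wording)."""
    return (
        "You are evaluating a customer support conversation.\n"
        f"Criterion: {criterion}\n"
        "Answer with exactly one word: True or False.\n\n"
        f"Conversation:\n{format_session(turns)}"
    )


def evaluate_session(
    turns: List[Turn],
    criteria: List[str],
    judge: Callable[[str], str],
) -> Dict[str, bool]:
    """Ask the judge about each criterion and parse its True/False answer."""
    results: Dict[str, bool] = {}
    for criterion in criteria:
        answer = judge(build_eval_prompt(turns, criterion)).strip().lower()
        results[criterion] = answer.startswith("true")
    return results
```

Passing the complete transcript, rather than individual turns, lets the judge assess criteria that depend on earlier context in the conversation.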

Tools used

o1-preview

Outcome

LACE provides automated evaluation closely aligned with human judgment, achieving over 90% accuracy on context-dependent criteria. Dashboard-driven feedback loops built on its output enable continuous chatbot improvement and have reduced inefficient interactions.
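As an illustration of how alignment with human judgment might be measured, the sketch below computes the fraction of sessions where an automated label matches a human label for one criterion. The source does not state how its accuracy figure was calculated; this simple agreement rate, the function name, and the sample labels are assumptions for illustration only.

```python
from typing import List


def agreement_rate(llm_labels: List[bool], human_labels: List[bool]) -> float:
    """Fraction of sessions where the automated label equals the human label."""
    if len(llm_labels) != len(human_labels):
        raise ValueError("label lists must have the same length")
    if not llm_labels:
        raise ValueError("at least one labeled session is required")
    matches = sum(l == h for l, h in zip(llm_labels, human_labels))
    return matches / len(llm_labels)


# Hypothetical labels for four sessions on a single criterion.
llm = [True, False, True, True]
human = [True, False, False, True]
rate = agreement_rate(llm, human)  # 3 of 4 labels match
```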

Results

Accuracy (context-dependent criteria): over 90%

Source

https://tech.instacart.com/turbocharging-customer-support-chatbot-development-with-llm-based-automated-evaluation-6a269aae56b2


Grounding & classification

Source type: technical build writeup

22 fields verified against source quotes.

agentic workflow · chatbot · conversational ai · multi agent workflow · chat transcript · human review described · metric backed · named customer · production runtime claimed · tools described · workflow described · ecommerce · accuracy improvement · customer satisfaction · technical build writeup · customer support · quality assurance · ai draft human approval · monitor detect alert