quality_assurance · saas · workflow

New Computer improves Dot's memory retrieval with LangSmith: 50% higher recall and 40% higher precision

New Computer needed to rapidly iterate on diverse memory retrieval methods for Dot's agentic memory system while preserving user privacy, facing a combinatorial explosion of experiments as they tested multiple retrieval techniques in parallel.

How it works

Common implementation structure

How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.

Stage 1 · Synthetic user cohort generation

A cohort of synthetic users with LLM-generated backstories is created to enable privacy-preserving retrieval testing.
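A minimal sketch of this stage, assuming the LLM call is injected as a plain function. The names `SyntheticUser`, `build_cohort`, and `fake_llm`, and the occupation and city seed lists, are hypothetical and illustrative only; the source does not describe how New Computer generated its cohort.

```python
import random
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SyntheticUser:
    user_id: str
    backstory: str


def build_cohort(
    n: int,
    generate_backstory: Callable[[str], str],
    seed: int = 0,
) -> List[SyntheticUser]:
    """Create n synthetic users; no real user data is involved."""
    rng = random.Random(seed)  # seeded so the cohort is reproducible
    # Hypothetical seed attributes used to vary the prompts.
    occupations = ["nurse", "teacher", "software engineer", "chef"]
    cities = ["Lisbon", "Osaka", "Denver", "Nairobi"]
    cohort = []
    for i in range(n):
        prompt = (
            "Write a short first-person backstory for a "
            f"{rng.choice(occupations)} living in {rng.choice(cities)}."
        )
        cohort.append(
            SyntheticUser(
                user_id=f"synthetic-{i:03d}",
                backstory=generate_backstory(prompt),
            )
        )
    return cohort


def fake_llm(prompt: str) -> str:
    """Stand-in for a real LLM call (assumption: swap in your own client)."""
    return "[stub backstory] " + prompt


cohort = build_cohort(3, fake_llm)
```

Passing the generator in as a callable keeps the sketch runnable offline and makes it easy to substitute a real model client later.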

Tools used

LangSmith, BM25

Outcome

New Computer achieved 50% higher recall and 40% higher precision compared to their baseline dynamic memory retrieval, and greatly improved team iteration speed for evaluating and adjusting conversation prompts.
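For reference, recall and precision for a single retrieval query can be computed as in the sketch below. The function name `precision_recall` and the memory IDs are illustrative assumptions, not taken from the source, and this set-based definition is the standard one rather than a confirmed description of New Computer's evaluation.

```python
from typing import Iterable, Set, Tuple


def precision_recall(
    retrieved: Iterable[str], relevant: Set[str]
) -> Tuple[float, float]:
    """Return (precision, recall) for one query.

    precision = relevant items retrieved / all items retrieved
    recall    = relevant items retrieved / all relevant items
    """
    retrieved_set = set(retrieved)
    hits = len(retrieved_set & relevant)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: 4 memories retrieved, 3 labeled as relevant.
p, r = precision_recall(["m1", "m2", "m3", "m4"], {"m1", "m3", "m7"})
```

Here two of the four retrieved memories are relevant (precision 0.5) and two of the three relevant memories were found (recall about 0.67).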

What failed first

The initial baseline used simple semantic search retrieving a fixed number of memories per query, which proved insufficient across diverse query types where BM25 or meta-field pre-filtering performed better.
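To illustrate the alternatives named above, here is a small pure-Python sketch of BM25 keyword scoring combined with a metadata pre-filter. The memory schema (`id`, `text`, `kind`), the `kind` filter, the whitespace tokenizer, and the parameter defaults are all assumptions made for this example; the source does not specify how Dot stores or filters memories.

```python
import math
from collections import Counter
from typing import Dict, List, Optional


def tokenize(text: str) -> List[str]:
    return text.lower().split()


def bm25_rank(
    query: str,
    memories: List[Dict[str, str]],
    kind: Optional[str] = None,
    k1: float = 1.5,
    b: float = 0.75,
    top_k: int = 3,
) -> List[str]:
    """Rank memory IDs by BM25 score, optionally pre-filtering on a meta field."""
    # Meta-field pre-filter: narrow the candidate set before scoring.
    candidates = [m for m in memories if kind is None or m["kind"] == kind]
    if not candidates:
        return []
    docs = [tokenize(m["text"]) for m in candidates]
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: number of candidates containing each term.
    df = Counter(t for d in docs for t in set(d))
    scored = []
    for m, d in zip(candidates, docs):
        tf = Counter(d)
        score = 0.0
        for t in tokenize(query):
            if t not in tf:
                continue
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            norm = tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[t] * (k1 + 1) / norm
        if score > 0:
            scored.append((score, m["id"]))
    scored.sort(key=lambda pair: -pair[0])
    return [mem_id for _, mem_id in scored[:top_k]]


# Hypothetical memories for illustration.
memories = [
    {"id": "m1", "text": "dentist appointment on friday morning", "kind": "event"},
    {"id": "m2", "text": "favorite restaurant is the ramen place downtown", "kind": "preference"},
    {"id": "m3", "text": "friday dinner reservation at the ramen place", "kind": "event"},
]
ranked = bm25_rank("ramen friday", memories)
prefs_only = bm25_rank("ramen friday", memories, kind="preference")
```

Unlike a fixed top-k semantic search, this variant returns only memories with a non-zero keyword match, and the pre-filter shrinks the candidate pool when the query type makes a meta field informative.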

Results

Volume: 50% higher recall

Source

https://blog.langchain.dev/customers-new-computer/


Grounding & classification

Source type: vendor customer story

21 fields verified against source quotes.

agentic workflow · conversational ai · personalization · rag · knowledge base · metric backed · named customer · production runtime claimed · tools described · workflow described · software · accuracy improvement · employee productivity · vendor customer story · quality assurance · rag answering