Podium reduces engineering intervention by 90% and improves AI Employee accuracy using LangSmith
Podium's AI Employee made 20 to 30 LLM calls per interaction, which made it hard to understand agent behavior or debug issues without engineering involvement. The TPS support team lacked the visibility into LLM inputs and outputs that it needed to resolve customer-reported problems independently.
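To make the visibility problem concrete, here is a minimal sketch of grouping every LLM call in one interaction under a single trace, so that a non-engineer can read the inputs and outputs step by step. It uses only the Python standard library and is not LangSmith's API. The names InteractionTrace, LLMCall, fake_llm, and the step labels are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class LLMCall:
    """One recorded model call: which step made it, what went in, what came out."""
    step: str
    prompt: str
    output: str


@dataclass
class InteractionTrace:
    """All LLM calls made while handling a single customer interaction."""
    interaction_id: str
    calls: list[LLMCall] = field(default_factory=list)

    def record(self, step: str, prompt: str, llm: Callable[[str], str]) -> str:
        # Run the model and keep the input/output pair for later inspection.
        output = llm(prompt)
        self.calls.append(LLMCall(step, prompt, output))
        return output

    def find(self, step: str) -> list[LLMCall]:
        # Let a support agent pull up only the calls for one step.
        return [call for call in self.calls if call.step == step]


def fake_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real model call.
    return f"echo: {prompt}"


trace = InteractionTrace("conv-001")
trace.record("classify_intent", "Customer: thanks, bye!", fake_llm)
trace.record("draft_reply", "Write a closing message.", fake_llm)

for call in trace.calls:
    print(f"[{trace.interaction_id}] {call.step}: {call.prompt!r} -> {call.output!r}")
```

With 20 to 30 such calls per interaction, filtering by step (as find does here) is one simple way to narrow down where a response went wrong.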
How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · Customer inquiry to AI Employee
Podium's AI Employee receives customer inquiries on behalf of local businesses.
Tools used
LangChain, LangSmith, LangGraph
Outcome
After fine-tuning with LangSmith-curated datasets, the F1 score of Podium's AI Employee rose from 91.7% to 98.6% (a 7.5% relative gain, or 6.9 percentage points), exceeding the team's quality threshold. Engineering intervention for support issues fell by 90%, and customer satisfaction scores improved.
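The F1 figures can be read two ways, so a short worked sketch may help. It shows the standard F1 formula (the harmonic mean of precision and recall) and the difference between an absolute gain in percentage points and a relative gain in percent. Only the two scores, 91.7 and 98.6, come from the case. The confusion counts passed to f1_score are made-up numbers for illustration.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 is the harmonic mean of precision and recall."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def gains(before: float, after: float) -> tuple[float, float]:
    """Return (absolute gain in points, relative gain in percent)."""
    absolute_points = after - before
    relative_percent = (after - before) / before * 100
    return absolute_points, relative_percent


# Hypothetical counts, only to show how an F1 score is computed.
example_f1 = f1_score(tp=90, fp=10, fn=10)
print(f"example F1: {example_f1:.3f}")

# Scores reported in the case, in percent.
abs_pts, rel_pct = gains(91.7, 98.6)
print(f"{abs_pts:.1f} percentage points, {rel_pct:.1f}% relative")
```

The last line prints "6.9 percentage points, 7.5% relative", which is why the reported improvement of 7.5% is a relative figure and not the difference between the two scores.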
What failed first
The AI Employee struggled to recognize when a conversation had naturally ended, resulting in awkward repeated goodbyes. Resolving such issues required calling in engineers to review model inputs and outputs and rewrite code.