
Sword Health: lessons learned shipping LLM-powered physical therapy AI agent Phoenix

Healthcare has long faced a trade-off between quality and affordability. Shipping LLM-powered products in a highly regulated industry makes it hard to ensure safety, consistency, and reliability, and inconsistency issues emerge once features reach production.

How it works

Common implementation structure

How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.

Stage 1 · Patient session begins

A patient rehabilitation session begins, activating Phoenix's AI care agent support.

Tools used

Phoenix, Gondola, Streamlit, GPT-4.0, Claude 3.5 Sonnet, MySQL, vector database, Langfuse, LangSmith, RAGAS

Outcome

Sword Health shipped and iterated on many LLM-powered features across its product portfolio over several years, establishing a systematic development practice with guardrails, evals, RAG, and feedback loops. Switching from GPT-4.0 to Claude 3.5 Sonnet with minor prompt adjustments produced an increase in performance of around 10 percentage points.
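To make the "percentage points" figure concrete, here is a minimal sketch of how a model swap could be compared on a fixed eval set. The source does not describe how the evaluation was run, so everything here is an assumption for illustration: the helper names `pass_rate` and `percentage_point_delta` are hypothetical, and the pass/fail outcomes are made-up toy data, not Sword Health's results.

```python
from typing import Sequence


def pass_rate(results: Sequence[bool]) -> float:
    """Return the share of passing eval cases as a percentage (0-100)."""
    if not results:
        raise ValueError("results must not be empty")
    return 100.0 * sum(results) / len(results)


def percentage_point_delta(baseline: Sequence[bool], candidate: Sequence[bool]) -> float:
    """Candidate pass rate minus baseline pass rate, in percentage points."""
    return pass_rate(candidate) - pass_rate(baseline)


# Made-up outcomes for the same 10 eval cases under two models (illustrative only).
baseline_results = [True, True, True, False, True, False, True, True, False, True]
candidate_results = [True, True, True, True, True, False, True, True, False, True]

delta = percentage_point_delta(baseline_results, candidate_results)
print(
    f"baseline={pass_rate(baseline_results):.1f}% "
    f"candidate={pass_rate(candidate_results):.1f}% "
    f"delta={delta:+.1f} pp"
)
```

A percentage-point difference is an absolute gap between two rates. In this toy data, moving from 70% to 80% is a gain of 10 percentage points, which is not the same as a 10% relative improvement.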

Results

Volume: around 30%

Source

https://www.infoq.com/presentations/ai-healthcare-learnings/


Grounding & classification

Source type: technical build writeup

36 fields verified against source quotes.

ai agent, conversational ai, knowledge search, rag, sentiment analysis, chat transcript, knowledge base, medical record, failure mode described, human review described, metric backed, named customer, production runtime claimed, tools described, workflow described, healthcare, accuracy improvement, employee productivity, technical build writeup, customer support, quality assurance, ai draft human approval, human review queue, rag answering