Workflow · education · workflow

How Duolingo designs structured LLM-powered conversations for Video Call with Lily

Letting an LLM converse freely with language learners produces generic, off-character, off-level responses; Duolingo needed a structured pipeline to ensure every Video Call with Lily stays at the right CEFR level, matches Lily's established personality, and has a clear conversational purpose.

How it works

Common implementation structure

How this type of workflow is generally built, generalized across documented cases — not tied to any one vendor's stack. Click any stage to read what happens there. Specific products that implement these stages appear in “Tools commonly seen” below.

Stage 1 · Learning designer writes system instructions

Duolingo Learning Designers write the instructions that the System gives to the Assistant (Lily) about how to act and what to say.

Tools used

ChatGPTClaudeGemini

Outcome

Duolingo built a structured multi-prompt pipeline for Video Call with Lily featuring separate first-question generation, persistent user memory via a List of Facts, and dynamic mid-call evaluation, enabling personalized, level-appropriate speaking practice.

What failed first

Two failure modes emerged during development: combining all instructions into one prompt overloaded the LLM and produced overly complex sentences or missing vocabulary; and without mid-call evaluation, Lily would ignore learner cues and stay on her pre-assigned topic regardless of what the learner wanted to discuss.

Results

Volumedelight and sass—and, of course, the opportunity for speaking practice

Source

https://blog.duolingo.com/ai-and-video-call/

How we source this →

Grounding & classification

Source type: technical build writeup

19 fields verified against source quotes, 1 dropped as unverifiable.

content generationconversational aipersonalizationsummarizationchat transcriptfailure mode describedhuman review describednamed customerproduction runtime claimedtools describedworkflow describededucationtechnical build writeupagentic task execution