
Duolingo builds structured LLM prompt system with persistent cross-session memory to power AI speaking practice with Lily

LLMs alone cannot serve as effective language tutors. Simply instructing a model to speak a target language with a learner is insufficient; each task in the call needs purpose-specific prompting with a predictable structure.

How it works

Common implementation structure

How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.

Stage 1 · Pre-call question generation

While the call is connecting, the system uses an LLM to generate an appropriate first question for the learner.
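A minimal sketch of this stage, not Duolingo's actual implementation. The `Learner` fields, the prompt wording, and the `LLM` callable type are all assumptions made for illustration; the source only states that an LLM generates a first question while the call connects.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical type: any function that maps a prompt string to model text.
LLM = Callable[[str], str]


@dataclass
class Learner:
    # All fields are assumed for illustration; the source does not list them.
    target_language: str
    level: str
    known_facts: List[str] = field(default_factory=list)


def build_first_question_prompt(learner: Learner) -> str:
    """Assemble a single-purpose prompt that asks only for an opening question."""
    facts = "\n".join(f"- {f}" for f in learner.known_facts) or "- (none yet)"
    return (
        "You are Lily, a speaking-practice partner.\n"
        f"Target language: {learner.target_language}\n"
        f"Learner level: {learner.level}\n"
        f"Known facts about the learner:\n{facts}\n"
        "Task: write ONE short opening question suited to this level.\n"
        "Output only the question."
    )


def generate_first_question(learner: Learner, llm: LLM) -> str:
    """Call the model once and return the trimmed question text."""
    return llm(build_first_question_prompt(learner)).strip()


def fake_llm(prompt: str) -> str:
    # Stand-in for a real model call so the sketch runs offline.
    return "  ¿Qué hiciste hoy?  "


if __name__ == "__main__":
    learner = Learner("Spanish", "A2", ["Has a dog named Max"])
    print(generate_first_question(learner, fake_llm))
```

Keeping the prompt limited to one task means the output is short enough to be ready by the time the call connects; a real system would swap `fake_llm` for its own model client.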

Outcome

Duolingo implemented separate LLM instructions for pre-call question generation, main conversation, and mid-call evaluation, plus a post-call 'List of Facts' memory system enabling Lily to recall personal details about users across sessions.
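The split described above could be sketched as follows, assuming one instruction string per purpose and a post-call step that merges newly extracted facts into a stored list. The instruction texts, stage keys, and the `update_fact_list` helper are hypothetical placeholders, not taken from Duolingo's system.

```python
from typing import Callable, Dict, List

# Hypothetical type: any function that maps a prompt string to model text.
LLM = Callable[[str], str]

# One instruction set per purpose instead of a single bundled prompt.
# The wording of each instruction is a placeholder.
STAGE_INSTRUCTIONS: Dict[str, str] = {
    "pre_call": "Write one opening question for the learner.",
    "conversation": "Hold a short spoken conversation at the learner's level.",
    "mid_call_eval": "Assess the conversation so far and say whether to adjust.",
    "post_call_facts": "List personal facts the learner stated, one per line.",
}


def build_prompt(stage: str, context: str) -> str:
    """Combine the instructions for exactly one stage with its context."""
    return f"{STAGE_INSTRUCTIONS[stage]}\n\nContext:\n{context}"


def update_fact_list(existing: List[str], transcript: str, llm: LLM) -> List[str]:
    """Extract facts from a finished call and merge them into the stored list.

    Duplicates are skipped with a case-insensitive comparison, so the list
    can be carried across sessions without growing on repeated facts.
    """
    raw = llm(build_prompt("post_call_facts", transcript))
    merged = list(existing)
    seen = {fact.lower() for fact in existing}
    for line in raw.splitlines():
        fact = line.strip().lstrip("-• ").strip()  # drop bullet markers
        if fact and fact.lower() not in seen:
            merged.append(fact)
            seen.add(fact.lower())
    return merged
```

The merged list would then be passed as context to the next session's prompts, which is one way the assistant could recall personal details across calls.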

What failed first

Bundling all call instructions into a single prompt overloaded the LLM, causing it to produce overly complex output or forget prepared vocabulary. In a live call, Lily also ignored a user's topic change and returned to an unrelated scheduled subject.

Source

https://blog.duolingo.com/ja/duolingo-ai-for-speaking-practice/


Grounding & classification

Source type: technical build writeup

12 fields verified against source quotes, 1 dropped as unverifiable.

conversational ai · personalization · summarization · voice ai · call recording · failure mode described · named customer · production runtime claimed · tools described · workflow described · education · technical build writeup · agentic task execution