customer_support · saas · workflow

Elastic Field Engineering builds a GenAI customer support chat interface with RAG and streaming

Building a chat interface for a GenAI support assistant presented novel UI/UX challenges: users were left waiting with no feedback during slow LLM responses, streaming connections could hang silently for over a minute, and conveying complex multi-source conversation context inside a constrained UI required new design patterns.

How it works

Common implementation structure

How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.

Stage 1 · User submits question

A user asks a question through the chat interface, which is backed by a streaming LLM API.

Tools used

EUI, RAG, LLM

Outcome

The team shipped a custom chat interface using their EUI component library with a branded loading animation, a 10-second killswitch for stalled streams, and a prepended context-selector UI that lets users choose and edit multiple context sources before submitting a question.

What failed first

The first LLM endpoint used for internal alpha-testing did not stream its responses, returning the entire answer in a single HTTP response body, which caused unacceptably long waits. Separately, live streaming connections would frequently return a 200 OK and then hang, with most failed streams taking over a minute to resolve.
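
The stalled-stream failure described above is the case a killswitch guards against. The following is a minimal sketch, not the team's actual implementation: it assumes the 10-second limit is measured from the last received chunk (an idle timeout), and the names `readWithKillswitch` and `stalledError` are hypothetical. It reads from any async iterable of text chunks and gives up if the next chunk does not arrive in time.

```typescript
// Hypothetical helper names; the idle-timeout interpretation is an assumption.
function stalledError(idleMs: number): Error {
  const err = new Error(`No data received for ${idleMs} ms`);
  err.name = "StreamStalledError";
  return err;
}

// Reads text chunks from a stream and rejects if the gap between
// chunks (or before the first chunk) exceeds idleMs.
async function readWithKillswitch(
  chunks: AsyncIterable<string>,
  onChunk: (text: string) => void,
  idleMs: number = 10_000,
): Promise<string> {
  const iterator = chunks[Symbol.asyncIterator]();
  let answer = "";
  try {
    while (true) {
      let timer: ReturnType<typeof setTimeout> | undefined;
      // A fresh timer per chunk: it only fires if the stream goes quiet.
      const stalled = new Promise<never>((_, reject) => {
        timer = setTimeout(() => reject(stalledError(idleMs)), idleMs);
      });
      try {
        const result = await Promise.race([iterator.next(), stalled]);
        if (result.done) return answer;
        answer += result.value;
        onChunk(result.value); // e.g. append to the visible message
      } finally {
        clearTimeout(timer);
      }
    }
  } finally {
    // Ask the source to clean up, without waiting on a source that may be hung.
    void iterator.return?.().catch(() => undefined);
  }
}
```

A caller would catch the `StreamStalledError` and show a retry prompt instead of leaving the loading state on screen. In a browser, this would typically be paired with an `AbortController` so the underlying request is cancelled as well.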

Results

Time saved: 100-500 ms

Volume: 1-2.5 s

Source

https://www.elastic.co/search-labs/blog/genai-elastic-elser-chat-interface

Grounding & classification

Source type: technical build writeup

23 fields verified against source quotes, 1 dropped as unverifiable.

chatbot, conversational ai, knowledge search, rag, chat transcript, knowledge base, metric backed, named customer, tools described, workflow described, software, response time reduction, technical build writeup, customer support, it support, rag answering