customer_support · saas · workflow

Elastic Field Engineering builds a chat interface for a GenAI customer support chatbot with RAG and streaming

Building a chat interface for a GenAI support assistant presented novel UI/UX challenges: users were left waiting with no feedback during slow LLM responses, streaming connections could hang silently for over a minute, and conveying complex multi-source conversation context inside a constrained UI required new design patterns.

How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack.
Stage 1 · User submits question
A user asks a question through the chat interface, which is backed by a streaming LLM API.
Tools used
EUI · RAG · LLM
Outcome

The team shipped a custom chat interface using their EUI component library with a branded loading animation, a 10-second killswitch for stalled streams, and a prepended context-selector UI that lets users choose and edit multiple context sources before submitting a question.
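As a rough illustration of the context-selector idea, the sketch below prepends whichever context sources the user left selected (and possibly edited) to the question before it is submitted. The names `ContextSource` and `build_prompt`, the bracketed label format, and the `Question:` prefix are all hypothetical and are not taken from the source; the team's actual interface is built with EUI and its prompt format is not documented here.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ContextSource:
    """One user-visible context block (hypothetical shape, for illustration)."""
    label: str             # display name shown in the selector
    text: str              # content the user may have edited before submitting
    selected: bool = True  # whether the user kept this source enabled


def build_prompt(question: str, sources: Sequence[ContextSource]) -> str:
    """Prepend every selected, non-empty context source to the question."""
    blocks = [
        f"[{source.label}]\n{source.text.strip()}"
        for source in sources
        if source.selected and source.text.strip()
    ]
    blocks.append(f"Question: {question.strip()}")
    return "\n\n".join(blocks)
```

Deselected or empty sources are simply skipped, so the question is still submitted on its own when the user chooses no context.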

What failed first

The first LLM endpoint used for internal alpha-testing did not stream its responses, returning the entire answer in a single HTTP response body, which caused unacceptably long waits. Separately, live streaming connections would frequently return a 200 OK and then hang, with most failed streams taking over a minute to resolve.
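A stream that returns 200 OK and then goes silent is the case an idle-timeout killswitch is meant to catch. The following is a minimal sketch of that pattern, assuming an async iterator of text chunks; it is not the team's implementation, which runs in the browser. `with_killswitch`, `StreamStalledError`, `flaky_stream`, and `render` are hypothetical names, and the 10-second default mirrors the figure quoted in the outcome.

```python
import asyncio
from typing import AsyncIterator, List, Tuple, TypeVar

T = TypeVar("T")


class StreamStalledError(Exception):
    """Raised when no chunk arrives within the idle window."""


async def with_killswitch(
    stream: AsyncIterator[T], idle_seconds: float = 10.0
) -> AsyncIterator[T]:
    """Re-yield chunks from `stream`, aborting if it goes quiet for too long."""
    iterator = stream.__aiter__()
    while True:
        try:
            # The timer restarts for every chunk, so a slow but live stream
            # survives; only a silent one is cut off.
            chunk = await asyncio.wait_for(iterator.__anext__(), timeout=idle_seconds)
        except StopAsyncIteration:
            return  # the stream ended normally
        except asyncio.TimeoutError:
            raise StreamStalledError(f"no data for {idle_seconds}s") from None
        yield chunk


async def flaky_stream() -> AsyncIterator[str]:
    """Hypothetical stand-in for a connection that starts, then hangs."""
    yield "Hello"
    yield ", world"
    await asyncio.sleep(3600)  # simulates the silent hang
    yield "never delivered"


async def render(stream: AsyncIterator[str], idle_seconds: float) -> Tuple[List[str], str]:
    """Collect chunks and report whether the stream completed or stalled."""
    received: List[str] = []
    try:
        async for chunk in with_killswitch(stream, idle_seconds):
            received.append(chunk)
    except StreamStalledError:
        return received, "stalled"
    return received, "complete"
```

Calling `asyncio.run(render(flaky_stream(), 0.05))` keeps the two chunks that arrived and reports the stall, so a UI could show the partial answer alongside a retry prompt instead of waiting out the hang.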

Results
Time saved: 100 - 500ms
Volume: 1 - 2.5s
Source

https://www.elastic.co/search-labs/blog/genai-elastic-elser-chat-interface


Grounding & classification
Source type: technical build writeup
23 fields verified against source quotes, 1 dropped as unverifiable.
chatbot · conversational ai · knowledge search · rag · chat transcript · knowledge base · metric backed · named customer · tools described · workflow described · software · response time reduction · technical build writeup · customer support · it support · rag answering