Wix builds a domain-adapted custom LLM that outperforms GPT-3.5 on Wix-specific tasks
Standard LLM customization techniques—prompt engineering, RAG, and task-specific fine-tuning—suffered from high cost, high latency, model hallucination, and an inability to handle multiple domain tasks simultaneously.
How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · Build domain evaluation dataset
A custom Wix Q&A dataset was built from existing customer service live chats and FAQs to estimate how much Wix domain knowledge a model has.
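As a rough illustration of this stage, the sketch below assembles question-answer pairs from FAQ entries and live-chat transcripts and removes duplicate questions. It is not Wix's pipeline: the record shapes (FAQ dicts with `question`/`answer` keys, chats as lists of `(speaker, text)` tuples), the "customer question followed by agent reply" pairing rule, the function names, and the sample data are all assumptions made for the example.

```python
import json
import re
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class QAItem:
    question: str
    answer: str
    source: str  # "faq" or "live_chat"


def normalize(text):
    """Lowercase and collapse whitespace so near-identical questions compare equal."""
    return re.sub(r"\s+", " ", text).strip().lower()


def from_faqs(faqs):
    """Assumed shape: each FAQ is a dict with 'question' and 'answer' keys."""
    return [QAItem(f["question"].strip(), f["answer"].strip(), "faq") for f in faqs]


def from_chats(chats):
    """Pair each customer turn ending in '?' with the agent turn that follows it.

    Assumed shape: each chat is a list of (speaker, text) tuples.
    """
    items = []
    for turns in chats:
        for (speaker, text), (next_speaker, next_text) in zip(turns, turns[1:]):
            is_question = speaker == "customer" and text.strip().endswith("?")
            if is_question and next_speaker == "agent":
                items.append(QAItem(text.strip(), next_text.strip(), "live_chat"))
    return items


def build_eval_set(faqs, chats):
    """Merge both sources, keeping the first item seen for each normalized question."""
    seen = set()
    dataset = []
    for item in from_faqs(faqs) + from_chats(chats):
        key = normalize(item.question)
        if key and key not in seen:
            seen.add(key)
            dataset.append(item)
    return dataset


def to_jsonl(dataset):
    """Serialize the dataset as one JSON object per line."""
    return "\n".join(json.dumps(asdict(item)) for item in dataset)


# Made-up sample data, for illustration only.
SAMPLE_FAQS = [
    {"question": "How do I connect a domain?",
     "answer": "Open the domain settings and follow the connection steps."},
]
SAMPLE_CHATS = [[
    ("customer", "how do I  connect a domain?"),
    ("agent", "You can do that from the domain settings."),
    ("customer", "Can I invite a second editor?"),
    ("agent", "Yes, send an invitation from the site settings."),
]]

if __name__ == "__main__":
    print(to_jsonl(build_eval_set(SAMPLE_FAQS, SAMPLE_CHATS)))
```

FAQ items are listed first, so when a chat repeats a FAQ question the curated FAQ answer is the one kept. A real pipeline would likely also need anonymization and answer-quality filtering of chat transcripts, which this sketch leaves out.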
Tools used
RAG · LLaMA 2 · LoRA · AWS P5
Outcome
Wix's smaller customized LLM showed better results than GPT-3.5 models on a variety of Wix tasks and opened the door for more impact in the organization.
What failed first
Prompt engineering and RAG required overly complex prompts that were prone to overfitting. Vendor-provided fine-tuning services overfitted to specific prompts without yielding cross-domain capabilities. Existing LLM and RAG solutions did not perform well enough on Wix domain tasks.
Results
Volume: 2%
Grounding & classification
Source type: technical build writeup
20 fields verified against source quotes, 1 dropped as unverifiable.
data extraction · knowledge search · rag · sentiment analysis · summarization · chat transcript · knowledge base · failure mode described · metric backed · source backed · tools described · workflow described · software · accuracy improvement · technical build writeup · back office ops