Booking.com uses supervised fine-tuning with LoRA/QLoRA to achieve 3x faster travel destination recommendations
Travelers increasingly express vacation needs in unstructured, nuanced natural language that traditional ML models struggle to handle, while prompt-based LLM solutions raise privacy concerns and cannot leverage Booking.com's proprietary behavioral data.
How it works
Common implementation structure
This section describes how this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · User submits travel request
Users describe their travel needs in natural language through the AI Trip Planner's conversational interface.
Tools used
LoRA, QLoRA, LLM as a judge
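For readers unfamiliar with LoRA, the following is a minimal sketch of the general technique, not Booking.com's implementation. It assumes the standard LoRA formulation: the pretrained weight W stays frozen and only a low-rank update, scaled by alpha / rank, is trained. The names `LoRALinear`, `lora_param_count`, and `matmul`, and the layer sizes used, are hypothetical and chosen for illustration. QLoRA additionally stores the frozen base weights in quantized form, which this sketch does not model.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_param_count(d_out, d_in, rank):
    """Trainable parameters in one adapter: A is rank x d_in, B is d_out x rank."""
    return rank * (d_in + d_out)


class LoRALinear:
    """Frozen weight W plus a trainable low-rank update B @ A scaled by alpha / rank."""

    def __init__(self, weight, rank, alpha, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight          # frozen pretrained weight, shape d_out x d_in
        self.scale = alpha / rank
        # A starts with small random values and B with zeros, so the adapter
        # contributes nothing until training updates B.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x):
        """Compute W x + (alpha / rank) * B (A x) for one input vector x."""
        col = [[v] for v in x]
        base = matmul(self.weight, col)
        delta = matmul(self.B, matmul(self.A, col))
        return [b[0] + self.scale * d[0] for b, d in zip(base, delta)]


# Hypothetical layer size: a 4096 x 4096 projection with a rank-8 adapter
# trains 65,536 parameters instead of 16,777,216.
adapter_params = lora_param_count(4096, 4096, 8)
full_params = 4096 * 4096
```

Because B is initialized to zero, a freshly created layer reproduces the frozen model's output exactly; fine-tuning then changes only A and B, which is why the trainable parameter count stays small.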
Outcome
The fine-tuned open-weight model reduced p99 inference latency by 67% (about 3x faster) relative to the baseline. Incorporating user location context improved Hit@5 by 8%, and an A/B test validated strong improvements in recommendation quality, all while keeping user data internal.
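The two figures above can be made concrete with a short sketch. It shows why a 67% latency reduction corresponds to roughly a 3x speed-up, and how a Hit@k metric is commonly computed. This assumes the usual definition of Hit@k (the relevant item appears among the top k recommendations); the source does not spell out its exact evaluation protocol. The function names and the destination lists are hypothetical, illustrative data only.

```python
from typing import Sequence, Tuple


def speedup_from_reduction(reduction: float) -> float:
    """Convert a fractional latency reduction (0.67 means 67%) into a speed-up factor."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    # New latency is (1 - reduction) of the old one, so speed-up is its inverse.
    return 1.0 / (1.0 - reduction)


def hit_at_k(ranked: Sequence[str], relevant: str, k: int = 5) -> int:
    """Return 1 if the relevant item is among the top-k ranked items, else 0."""
    return int(relevant in ranked[:k])


def mean_hit_at_k(examples: Sequence[Tuple[Sequence[str], str]], k: int = 5) -> float:
    """Average Hit@k over (ranked_list, relevant_item) pairs."""
    if not examples:
        raise ValueError("examples must not be empty")
    return sum(hit_at_k(ranked, relevant, k) for ranked, relevant in examples) / len(examples)


# Hypothetical evaluation data: ranked recommendations and the destination
# the traveler actually chose.
examples = [
    (["Lisbon", "Porto", "Seville", "Rome", "Nice", "Athens"], "Porto"),   # hit at rank 2
    (["Lisbon", "Porto", "Seville", "Rome", "Nice", "Athens"], "Athens"),  # rank 6, a miss
]

speedup = speedup_from_reduction(0.67)  # 1 / 0.33, roughly 3.03
score = mean_hit_at_k(examples, k=5)    # one hit out of two examples
```

Here `speedup` comes out slightly above 3, which matches the "about 3x faster" wording, and `score` is 0.5 because only the first example places the chosen destination in the top five.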
What failed first
The baseline production system, a prompt-based proprietary LLM accessed via an external API and combined with a traditional ML model, had slower inference and could not safely incorporate sensitive user data, because that data would have had to be processed externally.