ecommerce_ops · ecommerce · workflow

Rufus scales conversational shopping for 250M+ Amazon customers using Amazon Bedrock

Rufus was initially built on a custom in-house LLM optimized for the shopping domain, but training iterations took weeks or months, making it impossible to rapidly adopt advanced reasoning and larger context window capabilities as frontier models evolved.

How it works

Common implementation structure

How this type of workflow is generally built, generalized across documented cases — not tied to any one vendor's stack. Click any stage to read what happens there. Specific products that implement these stages appear in “Tools commonly seen” below.

Stage 1 · Customer shopping query

A customer asks Rufus a factual product question or shopping recommendation.

Tools used

Amazon BedrockAmazon NovaClaude SonnetAmazon Nova Web Groundingprompt cachingLLM-as-a-judgeconverse API

Outcome

Adopting Amazon Bedrock increased development velocity by over 6x. Rufus now serves more than 250 million annual users, with monthly users up 140% YoY and interactions up 210% YoY. Customers using Rufus are 60% more likely to complete a purchase, and auto-buy users save an average of 20% per purchase.

What failed first

Off-the-shelf models evaluated before the custom LLM performed poorly in shopping domain evaluations, and larger third-party models added unacceptable latency and cost penalties.

Results

Time saved140% YoY

VolumeMore than 250 million

Source

https://aws.amazon.com/blogs/machine-learning/how-rufus-scales-conversational-shopping-experiences-to-millions-of-amazon-customers-with-amazon-bedrock?tag=soumet-20

How we source this →

Grounding & classification

Source type: technical build writeup

40 fields verified against source quotes.

agentic workflowai agentconversational aipersonalizationragrecommendation systemknowledge baseproduct catalogfailure mode describedmetric backednamed customerproduction runtime claimedtools describedvendor confirmedworkflow describedecommerceretailaccuracy improvementconversion increasecost reductionemployee productivitythroughput increasetechnical build writeupcustomer supportecommerce opsagentic task executionrag answering