
Deploying DeepSeek-R1 on AWS: Our Journey Through Performance, Cost, and Reality

The team wanted to find out whether self-hosted open-source LLMs could replace paid AI coding assistants. The motivation was control, privacy, and long-term cost savings, but it was unclear whether the approach would be operationally viable at startup scale.

How it works

Common implementation structure

How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.

Stage 1 · Decide to evaluate self-hosted LLMs

The team decided to explore deploying open-source DeepSeek-R1 models in-house to evaluate their viability as alternatives to paid services like ChatGPT.

Tools used

DeepSeek-R1, AWS EC2, Docker, NVIDIA Container Toolkit, Ollama, Open WebUI

Outcome

Self-hosting proved not to be cost-effective for startups at small-to-medium scale. A single AWS instance cost around $414/month, while comparable SaaS offerings were far cheaper, and hidden operational overhead widened the gap further.

What failed first

The 16B model crashed under load even after aggressive quantization and batch-size reductions; performance degraded below usable token speeds with longer context windows; and tuning parameters introduced unpredictable latency and crashes.
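One way to put a number on "usable token speeds" is to read the timing fields that Ollama returns with each completion. The sketch below is an assumption about how such a check could look, not the team's actual harness: it uses Ollama's /api/generate endpoint on the default local port, and the helper names, the model argument, and the sample values are hypothetical.

```python
import json
import urllib.request


def tokens_per_second(response: dict) -> float:
    """Generation throughput from an Ollama /api/generate response.

    Ollama reports eval_count (tokens generated) and eval_duration
    (time spent generating, in nanoseconds).
    """
    eval_duration_ns = response["eval_duration"]
    if eval_duration_ns <= 0:
        raise ValueError("eval_duration must be positive")
    return response["eval_count"] / (eval_duration_ns / 1e9)


def fetch_generate_stats(model: str, prompt: str,
                         host: str = "http://localhost:11434") -> dict:
    """Send one non-streaming request to a local Ollama server (hypothetical helper)."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False})
    request = urllib.request.Request(
        f"{host}/api/generate",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return json.loads(reply.read())


# Illustrative values only, not measurements from the source.
sample = {"eval_count": 120, "eval_duration": 8_000_000_000}
print(f"{tokens_per_second(sample):.1f} tokens/s")
```

Running the same prompt at several context lengths and comparing the resulting rates would show where throughput drops off, which is the degradation described above.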

Results

Volume: $20/user/month

Cost replaced: around $414/month
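The two figures above imply a rough break-even point. The sketch below works through that arithmetic under loud assumptions: that the $20/user/month figure is the per-seat price of the SaaS alternative, that a single instance could actually serve every seat, and that operational overhead is zero. The source suggests the last two do not hold, so the real break-even is likely higher. The function and constant names are illustrative.

```python
import math


def break_even_seats(self_hosted_monthly: float, saas_per_seat_monthly: float) -> int:
    """Smallest seat count at which SaaS spend meets or exceeds the self-hosted bill."""
    if saas_per_seat_monthly <= 0:
        raise ValueError("per-seat price must be positive")
    return math.ceil(self_hosted_monthly / saas_per_seat_monthly)


# Figures from the results above; both are approximate.
INSTANCE_MONTHLY_USD = 414.0   # single AWS instance
SAAS_SEAT_MONTHLY_USD = 20.0   # assumed per-seat SaaS price

seats = break_even_seats(INSTANCE_MONTHLY_USD, SAAS_SEAT_MONTHLY_USD)
print(f"SaaS becomes the more expensive option at {seats} seats")
```

Below that seat count the per-seat subscription costs less than the instance alone, before any engineering time is counted.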

Source

https://liftoffllc.medium.com/deploying-deepseek-r1-on-aws-our-journey-through-performance-cost-and-reality-48168c6125d5


Grounding & classification

Source type: technical build writeup

21 fields verified against source quotes.

code generation, conversational ai, builder submitted, failure mode described, metric backed, tools described, workflow described, software, cost reduction, employee productivity, technical build writeup, back office ops