
Harvey Builds Enterprise-Grade RAG Systems for Legal and Professional Services Using LanceDB and Postgres

Deploying enterprise RAG for legal and professional services requires solving for sparse vs. dense retrieval trade-offs, performance and accuracy at massive scale, complex domain-specific data structures, and strict data privacy requirements that prevent sensitive client documents from leaving customer-controlled environments.

How it works

Common implementation structure

How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear under "Tools used" below.

Stage 1 · Multi-source document ingestion

Documents arrive from three sources: user-uploaded thread files, long-term Vault project documents, and private and public third-party legal data sources providing regulations, laws, and statutes.
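As a rough illustration of this stage, the sketch below normalizes documents from the three sources into one common record before any indexing happens. It is not Harvey's implementation: the `Source` enum, the `IngestedDocument` fields, and the `ingest` helper are hypothetical names chosen for this example, and the whitespace cleanup stands in for whatever parsing a real pipeline would do.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Iterator


class Source(Enum):
    """The three ingestion sources named above (labels are illustrative)."""
    THREAD_UPLOAD = "thread_upload"   # files a user uploads into a thread
    VAULT_PROJECT = "vault_project"   # long-term Vault project documents
    THIRD_PARTY = "third_party"       # regulations, laws, and statutes


@dataclass(frozen=True)
class IngestedDocument:
    """Hypothetical common record shared by every source."""
    doc_id: str
    source: Source
    text: str
    is_public: bool


def ingest(source: Source, raw_docs: Iterable[str],
           is_public: bool = False) -> Iterator[IngestedDocument]:
    """Collapse whitespace, skip empty documents, and tag each with its source."""
    for i, text in enumerate(raw_docs):
        cleaned = " ".join(text.split())
        if not cleaned:
            continue  # nothing to index
        yield IngestedDocument(f"{source.value}-{i}", source, cleaned, is_public)


# Example inputs are made up for illustration.
batch = [
    *ingest(Source.THREAD_UPLOAD, ["Draft  NDA\nfor review"]),
    *ingest(Source.VAULT_PROJECT, ["Master services agreement", "   "]),
    *ingest(Source.THIRD_PARTY, ["Statute text"], is_public=True),
]
```

Tagging each record with its source and a public/private flag at ingestion time is one way to let later stages apply different retention and access rules per source, which matters given the privacy requirements described above.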

Tools used

LanceDB Enterprise, Postgres, PGVector

Outcome

Harvey now serves users in 45 countries. Its PwC Tax AI collaboration produced a system preferred over off-the-shelf ChatGPT at a 91% rate, bringing highly accurate answers to hundreds of professional service firms worldwide.

Results

Time saved: <2s P50

Volume: 91%

Source

https://www.harvey.ai/blog/enterprise-grade-rag-systems


Grounding & classification

Source type: technical build writeup

23 fields verified against source quotes, 1 dropped as unverifiable.

document ai, enterprise search, knowledge search, rag, knowledge base, policy document, metric backed, named customer, production runtime claimed, tools described, workflow described, legal, professional services, accuracy improvement, technical build writeup, legal document review, legal ops, rag answering