back_office_ops · workflow
Nautilus: Dropbox's ML-powered full-text search engine architecture
Dropbox needed a new search engine capable of handling its massive-scale document corpus with personalized, near-real-time results tailored to each user's access permissions and behaviors.
How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear under "Tools used" below.
Stage 1 · File/user activity trigger
Users editing or sharing files generate index mutations that drive updates to the search index.
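As a rough illustration of this stage, the sketch below turns file activity events into index mutations and applies them to a toy in-memory inverted index. The event shape, the `IndexMutation` type, and the `ToyIndex` class are hypothetical names chosen for illustration; the source does not describe the actual data structures Nautilus uses.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexMutation:
    """Hypothetical record describing one change to apply to the search index."""
    op: str          # "upsert" or "delete"
    doc_id: str
    text: str = ""


def mutation_from_event(event: dict) -> IndexMutation:
    """Map a file activity event (assumed shape) to an index mutation."""
    if event["action"] == "delete":
        return IndexMutation(op="delete", doc_id=event["file_id"])
    # Simplification: every non-delete event re-indexes the current content.
    return IndexMutation(op="upsert", doc_id=event["file_id"],
                         text=event.get("content", ""))


class ToyIndex:
    """Toy inverted index; a stand-in for a real search index."""

    def __init__(self) -> None:
        self.postings = defaultdict(set)  # term -> set of doc ids
        self.doc_terms = {}               # doc id -> set of indexed terms

    def apply(self, mutation: IndexMutation) -> None:
        # Drop any stale postings for this document first.
        for term in self.doc_terms.pop(mutation.doc_id, set()):
            self.postings[term].discard(mutation.doc_id)
        if mutation.op == "upsert":
            terms = set(mutation.text.lower().split())
            self.doc_terms[mutation.doc_id] = terms
            for term in terms:
                self.postings[term].add(mutation.doc_id)

    def search(self, term: str) -> set:
        # Return a copy so callers cannot modify the index by accident.
        return set(self.postings.get(term.lower(), set()))
```

In this sketch an edit replaces the document's old postings in full, so a term removed from a file stops matching as soon as the mutation is applied.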
Tools used
Nautilus, Apache Tika, Kafka, Octopus, BM25
Outcome
Nautilus became the primary search engine at Dropbox after a shadow-mode qualification period, delivering significant improvements to time-to-index new and updated content.
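Shadow-mode qualification generally means the existing system keeps serving users while the same queries are also sent to the candidate system, whose results are compared but never shown. The sketch below measures top-k overlap between the two; `legacy_search` and `candidate_search` are hypothetical callables, and the source does not say which metrics Dropbox used during qualification.

```python
def shadow_compare(queries, legacy_search, candidate_search, k=5):
    """Return the mean top-k overlap between legacy and candidate results."""
    overlaps = []
    for query in queries:
        served = legacy_search(query)[:k]      # results the user actually sees
        shadow = candidate_search(query)[:k]   # computed for comparison only
        denom = max(len(served), len(shadow), 1)
        overlaps.append(len(set(served) & set(shadow)) / denom)
    return sum(overlaps) / len(overlaps) if overlaps else 0.0
```

A score near 1.0 suggests the candidate ranks much like the system it would replace; lower scores flag queries worth inspecting before cutover.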
Results
Time saved: significant improvements to the time-to-index for new and updated content
Grounding & classification
Source type: technical build writeup
20 fields verified against source quotes.
document ai, enterprise search, personalization, recommendation system, knowledge base, production runtime claimed, tools described, vendor confirmed, workflow described, software, cycle time reduction, technical build writeup, back office ops, data sync enrichment