
Nextdoor's path from pre-trained to fine-tuned embedding models for notifications, feed, and search ranking

Nextdoor needed richer content representations to capture nuanced user signals and improve personalization across products, while managing the high storage and serving costs of large fixed-dimensionality embeddings updated daily at scale.

How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear under "Tools used" below.
Stage 1 · Content text extraction
Text is extracted from each Nextdoor post's subject and body and from the text of its comments.
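As a rough illustration of this stage, the extraction step could be sketched as below. The `Post` and `Comment` classes, their field names, and the whitespace cleanup are all assumptions made for the example; the source only states that subject, body, and comment text are extracted.

```python
from dataclasses import dataclass, field


@dataclass
class Comment:
    # Hypothetical shape of a comment record; only the text matters here.
    text: str


@dataclass
class Post:
    # Hypothetical shape of a post record with the fields named in the stage.
    subject: str
    body: str
    comments: list[Comment] = field(default_factory=list)


def extract_texts(post: Post) -> list[str]:
    """Return the non-empty text fields of a post: subject, body, then comments."""
    candidates = [post.subject, post.body] + [c.text for c in post.comments]
    # Collapse runs of whitespace so each field becomes a single clean line.
    cleaned = (" ".join(t.split()) for t in candidates if t)
    # Drop fields that were empty or whitespace-only.
    return [t for t in cleaned if t]
```

Each returned string would then be a candidate input to an embedding model; how fields are combined or truncated before encoding is not described in the source.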
Tools used
Sentence-BERT, SBERT, PyTorch, SageMaker, FeatureStore, Airflow, HNSWlib, BERTopic, CLIP
Outcome

Fine-tuned embedding models delivered significant lifts in OKR metrics for notifications and feed, significantly reduced null query rates, improved query expansion latency by more than 10x, and improved user-post cosine similarity by up to 16% while reducing embedding dimensionality by more than 10x.
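For readers unfamiliar with the user-post cosine similarity metric mentioned above, a minimal sketch of the calculation follows. The function name and the example vectors are hypothetical; the source does not describe how user or post embeddings are constructed or compared in production.

```python
import math


def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors, in [-1, 1]."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimensionality")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Convention chosen for this sketch: a zero vector is similar to nothing.
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical low-dimensional embeddings, for illustration only.
user_embedding = [0.2, 0.1, 0.7]
post_embedding = [0.25, 0.05, 0.6]
score = cosine_similarity(user_embedding, post_embedding)
```

Because the score depends only on the angle between vectors, it can be compared across embedding spaces of different dimensionality, which is one reason it suits a before-and-after comparison when dimensionality is reduced.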

What failed first

Pre-trained off-the-shelf models were trained on public benchmark datasets with semantics different from the Nextdoor domain, and their high fixed dimensionality caused significant storage and serving costs. Earlier word embedding models produced higher rates of null search queries.

Results
Volume: more than 10x
Running since: early 2022
Source

https://engblog.nextdoor.com/from-pre-trained-to-fine-tuned-nextdoors-path-to-effective-embedding-applications-3a13b56d91aa


Grounding & classification
Source type: technical build writeup
36 fields verified against source quotes.
enterprise search, personalization, predictive analytics, recommendation system, knowledge base, social media post, builder submitted, metric backed, named customer, production runtime claimed, source backed, tools described, workflow described, software, accuracy improvement, employee productivity, response time reduction, technical build writeup, back office ops, data sync enrichment, extract classify route