recruiting · saas · workflow
Malt super-powers freelancer recommendation with retriever-ranker architecture and Qdrant vector database
Malt's original monolithic matching model had response times of up to one minute and was inflexible, making it difficult to adapt for future large language models or scale to real-time recommendation needs.
How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · Project posted, vector encoded
When a new project is posted, its details are encoded into a vector in real-time.
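The encoding step above can be sketched as follows. This is a minimal illustration, not Malt's implementation: the hashed bag-of-words `encode` function stands in for a learned text encoder, the in-memory `index` dictionary stands in for a vector database such as Qdrant, and the freelancer IDs, profile texts, and the `DIM` value are all hypothetical.

```python
import hashlib
import math
from typing import Dict, List, Tuple

DIM = 64  # hypothetical embedding size, chosen only for this toy example


def encode(text: str, dim: int = DIM) -> List[float]:
    """Toy stand-in for a learned encoder: hashed bag-of-words, L2-normalised."""
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def cosine(a: List[float], b: List[float]) -> float:
    """Dot product; equals cosine similarity because vectors are unit length."""
    return sum(x * y for x, y in zip(a, b))


def retrieve(query_vec: List[float],
             index: Dict[str, List[float]],
             k: int) -> List[Tuple[str, float]]:
    """Brute-force nearest-neighbour search over the in-memory index."""
    scored = [(fid, cosine(query_vec, vec)) for fid, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical freelancer profiles, encoded ahead of time.
profiles = {
    "freelancer-a": "python backend developer api",
    "freelancer-b": "brand illustrator and logo designer",
    "freelancer-c": "senior java architect for banking",
}
index = {fid: encode(text) for fid, text in profiles.items()}

# A newly posted project is encoded at request time, then matched.
project_vec = encode("python backend developer api")
top = retrieve(project_vec, index, k=2)
print(top)
```

In a retriever-ranker setup, a shortlist like `top` would then be passed to a slower, more precise ranking model. Keeping the vector search cheap is what lets the expensive model run on only a handful of candidates.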
Tools used
Qdrant, Kubernetes, Grafana, Prometheus, Docker
Outcome
After deploying the retriever-ranker architecture backed by Qdrant, p95 latency fell from tens of seconds (sometimes over a minute) to 3 seconds at most, and an A/B test confirmed an increase in project conversion without sacrificing recommendation quality.
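For readers unfamiliar with the metric, p95 latency is the response time that 95% of requests stay at or below. A minimal sketch using the nearest-rank method is shown here; the function name and the sample latencies are hypothetical and are not measurements from Malt's system.

```python
import math
from typing import Sequence


def percentile_nearest_rank(samples: Sequence[float], pct: float) -> float:
    """Return the pct-th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    # Nearest rank: the smallest 1-based position covering pct percent of samples.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Made-up response times in seconds, for illustration only.
latencies = [0.8, 1.1, 0.9, 2.7, 1.4, 1.0, 1.2, 0.7, 1.3, 2.9]
p95 = percentile_nearest_rank(latencies, 95)
print(f"p95 latency: {p95} s")
```

A percentile is used rather than the mean because a few very slow requests can hide behind a healthy-looking average, while p95 reflects what the slowest one in twenty users experiences.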
Results
Response time (p95): 3 seconds at most, down from tens of seconds
Running since: 2023
Grounding & classification
Source type: technical build writeup
21 fields verified against source quotes.
Tags: recommendation system, resume, metric backed, named customer, production runtime claimed, source backed, tools described, workflow described, software, conversion increase, response time reduction, technical build writeup, recruiting, extract classify route