recruiting · workflow
DPG Media ML platform: job classification, content-based ad targeting, and HR embeddings
DPG Media needed to serve relevant ads without user tracking, match job seekers to postings at scale across 13 online brands, and fix its HR-domain language model's failures on out-of-vocabulary words.
How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · Job ad classification
ML classifies job ads by type, such as data scientist versus data engineer postings.
Tools used
TensorFlow, Hugging Face, SageMaker, Airflow, AWS, Lambda, ECS, S3, DynamoDB, Snowflake, Docker, dbt, MLflow, Prometheus, Grafana, Databricks, Terraform, Seldon Core
Outcome
Out-of-vocabulary errors were resolved pragmatically: Levenshtein distance correction for misspellings and manual synonym mapping for novel terms, with corrected vectors added to the model and a full retrain avoided.
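The outcome above can be sketched in a few lines of Python. This is a minimal illustration, not DPG Media's implementation: the toy `vectors` table, the `synonyms` map, the `MAX_DISTANCE` threshold, and the function names `resolve` and `vector_for` are all hypothetical. It shows one plausible reading of the approach: look a token up directly, fall back to a hand-maintained synonym, then fall back to the nearest known word by Levenshtein distance, and register the resolved vector under the new spelling so no retraining is needed.

```python
from typing import Dict, List, Optional


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


# Hypothetical toy embedding table standing in for the HR-domain model.
vectors: Dict[str, List[float]] = {
    "engineer": [0.9, 0.1],
    "scientist": [0.2, 0.8],
    "developer": [0.85, 0.2],
}

# Hypothetical manually curated map from novel terms to known terms.
synonyms: Dict[str, str] = {"codesmith": "developer"}

MAX_DISTANCE = 2  # assumed cut-off; a real system would tune this


def resolve(token: str) -> Optional[str]:
    """Return the in-vocabulary word to use for token, or None."""
    token = token.lower()
    if token in vectors:
        return token
    if token in synonyms:
        return synonyms[token]
    best = min(vectors, key=lambda known: levenshtein(token, known))
    return best if levenshtein(token, best) <= MAX_DISTANCE else None


def vector_for(token: str) -> Optional[List[float]]:
    """Look up a vector, registering corrected spellings as new entries."""
    key = resolve(token)
    if key is None:
        return None
    vec = vectors[key]
    vectors.setdefault(token.lower(), vec)  # add the corrected entry, no retrain
    return vec
```

In this sketch `vector_for("enginer")` reuses the vector for `"engineer"`, and `"codesmith"` resolves through the synonym map. Tokens with no close match return `None`, which leaves room for the manual review step rather than forcing a bad correction.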
What failed first
The HR domain language model failed on out-of-vocabulary words — misspelled, archaic, or entirely novel job titles caused disproportionate errors that the model's existing fallback methods could not handle.
Results
Volume: over 13 million
Grounding & classification
Source type: technical build writeup
39 fields verified against source quotes.
data extraction, document classification, forecasting, personalization, recommendation system, resume, failure mode described, human review described, named customer, production runtime claimed, tools described, workflow described, media, error reduction, technical build writeup, data entry ops, marketing ops, recruiting, document to record, extract classify route