compliance_monitoring · workflow

Zillow uses NLP, trigram analysis, and LDA topic modeling to detect racial proxies in real estate listing descriptions

Zillow increasingly uses real estate listing descriptions as inputs to its AI systems. Because of historical housing segregation, that text can act as a proxy for protected classes such as race, so those AI systems could reproduce or amplify discrimination in ways that had not been systematically measured.

How it works
Common implementation structure
How this type of workflow is generally built, generalized across documented cases — not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.
Stage 1 · Listing data snapshot
Listing descriptions for all single-family homes actively listed on January 25, 2024, were collected for analysis.
Tools used
LLM, NLP, LDA, pyLDAvis
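As an illustration of the trigram analysis named above, the following is a minimal sketch that counts word trigrams in two groups of listing descriptions and ranks the phrases that are relatively more frequent in the first group. It is not Zillow's implementation: the tokenizer, the relative-frequency difference used for ranking, the function names, and the toy listings are all assumptions made for this example.

```python
import re
from collections import Counter


def trigrams(text):
    """Return the word trigrams of one listing description, lowercased."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [" ".join(tokens[i:i + 3]) for i in range(len(tokens) - 2)]


def trigram_counts(descriptions):
    """Count trigrams across a group of listing descriptions."""
    counts = Counter()
    for text in descriptions:
        counts.update(trigrams(text))
    return counts


def distinctive_trigrams(group_a, group_b, top_n=3):
    """Rank trigrams by relative frequency in group_a minus that in group_b.

    This ranking rule is an assumption for illustration only.
    """
    counts_a, counts_b = trigram_counts(group_a), trigram_counts(group_b)
    total_a = sum(counts_a.values()) or 1
    total_b = sum(counts_b.values()) or 1
    diffs = {
        gram: counts_a[gram] / total_a - counts_b[gram] / total_b
        for gram in set(counts_a) | set(counts_b)
    }
    # Sort by descending difference, breaking ties alphabetically.
    return sorted(diffs, key=lambda g: (-diffs[g], g))[:top_n]


# Hypothetical toy listings, not drawn from the source data.
group_a = [
    "Great investment opportunity for cash buyers.",
    "Investor special, great investment opportunity.",
]
group_b = [
    "Chef's kitchen with quartz countertops and a spa bath.",
    "Open floor plan with quartz countertops.",
]

print(distinctive_trigrams(group_a, group_b, top_n=1))
```

On real data, the two groups would be the descriptions from the neighborhood categories being compared, and a ranking with smoothing or a significance test would be more robust than a raw frequency difference.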
Outcome

The analysis confirmed statistically significant differences in text length, key phrases, and topic distributions between listing descriptions in majority non-Hispanic white and majority Black neighborhoods, establishing that listing description text can function as a racial proxy in downstream AI systems.
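The source does not say which statistical test was used for the text-length comparison. As one possible approach, the sketch below applies a two-sided permutation test to the difference in mean word count between two groups of descriptions. The function names, the number of permutations, and the sample word counts are hypothetical.

```python
import random
from statistics import mean


def word_count(text):
    """Length of a listing description in whitespace-separated words."""
    return len(text.split())


def permutation_test(lengths_a, lengths_b, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in mean length.

    Returns an estimated p-value: the share of random relabelings whose
    absolute mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = abs(mean(lengths_a) - mean(lengths_b))
    pooled = list(lengths_a) + list(lengths_b)
    n_a = len(lengths_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical word counts for two groups of listings.
lengths_a = [10, 11, 12, 13, 14, 15, 16, 17]
lengths_b = [30, 31, 32, 33, 34, 35, 36, 37]

p_value = permutation_test(lengths_a, lengths_b)
print(f"estimated p-value: {p_value:.4f}")
```

A permutation test is used here because it makes no assumption about how description lengths are distributed. With large samples, a Welch t-test or a Mann-Whitney U test would be a common alternative.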

Results
Volume: 144.03
Source

https://www.zillow.com/tech/using-ai-to-understand-the-complexities-and-pitfalls-of-real-estate-data/


Grounding & classification
Source type: technical build writeup
20 fields verified against source quotes.
data extraction, document ai, summarization, knowledge base, metric backed, named customer, production runtime claimed, tools described, workflow described, real estate, technical build writeup, compliance monitoring, quality assurance, extract classify route