Deep Contrastive Learning for Feature Alignment: Insights from Housing-Household Relationship Inference

Qian, Xiao, Dong, Shangjia, Davidson, Rachel

Feb-16-2025–arXiv.org Artificial Intelligence

Housing and household characteristics are key determinants of social and economic well-being, yet our understanding of their interrelationships remains limited. This study addresses this knowledge gap by developing a deep contrastive learning (DCL) model to infer housing-household relationships using the American Community Survey (ACS) Public Use Microdata Sample (PUMS). More broadly, the proposed model is suitable for a class of problems where the goal is to learn joint relationships between two distinct entities without explicitly labeled ground truth data. Our proposed dual-encoder DCL approach leverages co-occurrence patterns in PUMS and introduces a bisect K-means clustering method to overcome the absence of ground truth labels. The dual-encoder DCL architecture is designed to handle the semantic differences between housing (building) and household (people) features while mitigating noise introduced by clustering. To validate the model, we generate a synthetic ground truth dataset and conduct comprehensive evaluations. The model further demonstrates its superior performance in capturing housing-household relationships in Delaware compared to state-of-the-art methods. A transferability test in North Carolina confirms its generalizability across diverse sociodemographic and geographic contexts. Finally, the post-hoc explainable AI analysis using SHAP values reveals that tenure status and mortgage information play a more significant role in housing-household matching than traditionally emphasized factors such as the number of persons and rooms.

artificial intelligence, housing unit, machine learning, (17 more...)

arXiv.org Artificial Intelligence

Feb-16-2025

arXiv.org PDF

Add feedback

Country:
- North America > United States > North Carolina (0.25)

Genre:
- Research Report
  - New Finding (0.46)
  - Promising Solution (0.48)

Industry:
- Banking & Finance > Real Estate (0.55)
- Health & Medicine > Health Care Technology (0.46)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Neural Networks > Deep Learning (0.68)
  - Performance Analysis > Accuracy (0.95)
  - Statistical Learning > Clustering (0.86)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found