FairHome: A Fair Housing and Fair Lending Dataset

Bagalkotkar, Anusha; Karmakar, Aveek; Arnson, Gabriel; Linda, Ondrej

Sep-9-2024–arXiv.org Artificial Intelligence

We present the Fair Housing and Fair Lending dataset (FairHome), a dataset of around 75,000 examples across 9 protected categories. To the best of our knowledge, FairHome is the first publicly available dataset with binary labels for compliance risk in the housing domain. We demonstrate the usefulness of such a dataset by training a classifier on it and using that classifier to detect potential violations when a large language model (LLM) is used in the context of real-estate transactions. We benchmark the trained classifier against state-of-the-art LLMs, including GPT-3.5, GPT-4, LLaMA-3, and Mistral Large, in both zero-shot and few-shot settings. Our classifier outperformed these models with an F1-score of 0.91, underscoring the effectiveness of our dataset. WARNING: Some of the examples included in the paper are not polite, insofar as they reveal bias that might feel discriminatory to readers.

Country:
- North America > United States
  - Washington > King County
    - Seattle (0.14)
    - Kirkland (0.04)
  - Texas > Travis County
    - Austin (0.04)
- Europe > Croatia
  - Dubrovnik-Neretva County > Dubrovnik (0.04)
- Asia > China
  - Hong Kong (0.04)

Genre:
- Research Report (0.65)

Industry:
- Law (1.00)
- Banking & Finance > Real Estate (1.00)
- Government > Regional Government
  - North America Government > United States Government (0.93)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
