Removing biased data to improve fairness and accuracy

Sahil Verma, Michael Ernst, Rene Just

Feb-5-2021–arXiv.org Artificial Intelligence

Machine learning systems are often trained using data collected from historical decisions. If past decisions were biased, then automated systems that learn from historical data will also be biased. We propose a black-box approach to identify and remove biased training data. Machine learning models trained on such debiased data (a subset of the original training data) have low individual discrimination, often 0%. These models also have greater accuracy and lower statistical disparity than models trained on the full historical data. We evaluated our methodology in experiments using six real-world datasets. Our approach outperformed seven previous approaches in terms of individual discrimination and accuracy.
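
The abstract reports individual discrimination as a percentage but does not define it here. As a rough illustration only, the sketch below assumes one common black-box reading: the fraction of inputs whose prediction changes when nothing but a binary sensitive attribute is flipped. This is not the authors' procedure for identifying biased training data; the function `individual_discrimination`, the two toy models, and the sample rows are all hypothetical.

```python
from typing import Callable, List, Sequence

Row = Sequence[int]


def individual_discrimination(
    predict: Callable[[Row], int],
    rows: Sequence[Row],
    sensitive_idx: int,
) -> float:
    """Fraction of rows whose prediction changes when only the binary
    sensitive attribute (assumed to be 0 or 1) is flipped."""
    if not rows:
        return 0.0
    changed = 0
    for row in rows:
        flipped: List[int] = list(row)
        flipped[sensitive_idx] = 1 - flipped[sensitive_idx]
        # The model is treated as a black box: only its outputs are compared.
        if predict(row) != predict(flipped):
            changed += 1
    return changed / len(rows)


def biased_model(row: Row) -> int:
    # Hypothetical model: a stricter income threshold when the sensitive bit is 1.
    sensitive, income = row
    threshold = 60 if sensitive == 1 else 50
    return int(income >= threshold)


def fair_model(row: Row) -> int:
    # Hypothetical model that ignores the sensitive attribute entirely.
    return int(row[1] >= 50)


# Toy rows of the form (sensitive_attribute, income); invented for illustration.
rows = [(0, 40), (0, 55), (1, 55), (1, 70)]
biased_rate = individual_discrimination(biased_model, rows, sensitive_idx=0)
fair_rate = individual_discrimination(fair_model, rows, sensitive_idx=0)
print(biased_rate, fair_rate)
```

Under this assumed definition, a model scores 0% when flipping the sensitive attribute never changes its output on the evaluated inputs, which is the sense in which a rate of "often 0%" would be read.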

Country:
- Oceania > Australia
  - New South Wales > Sydney (0.04)
  - Western Australia > Perth (0.04)
- North America
  - United States
    - Michigan (0.04)
    - Ohio (0.04)
    - Nevada > Clark County
      - Las Vegas (0.04)
    - Florida > Broward County
      - Fort Lauderdale (0.04)
    - Louisiana > Orleans Parish
      - New Orleans (0.04)
    - Pennsylvania > Philadelphia County
      - Philadelphia (0.04)
    - Georgia > Fulton County
      - Atlanta (0.04)
    - Washington > King County
      - Seattle (0.04)
    - Massachusetts
      - Suffolk County > Boston (0.04)
      - Middlesex County > Cambridge (0.04)
    - California
      - Los Angeles County > Long Beach (0.04)
      - San Diego County > San Diego (0.04)
    - New York > New York County
      - New York City (0.06)
  - Canada
    - Quebec > Montreal (0.04)
    - Nova Scotia > Halifax Regional Municipality
      - Halifax (0.04)
- Europe
  - Germany (0.14)
  - United Kingdom > England
    - Bristol (0.04)
  - Sweden
    - Stockholm > Stockholm (0.04)
    - Vaestra Goetaland > Gothenburg (0.04)
  - Spain > Catalonia
    - Barcelona Province > Barcelona (0.05)
  - Netherlands > North Holland
    - Amsterdam (0.04)
  - France > Occitanie
    - Hérault > Montpellier (0.04)
  - Estonia > Harju County
    - Tallinn (0.04)
  - Belgium > Flanders
    - Flemish Brabant > Leuven (0.04)
- Asia > China
  - Hunan Province > Changsha (0.04)

Genre:
- Research Report (0.82)

Industry:
- Banking & Finance (0.67)
- Law > Civil Rights & Constitutional Law (0.67)
- Government > Regional Government (0.67)

Technology:
- Information Technology > Artificial Intelligence
  - Issues > Social & Ethical Issues (0.66)
  - Machine Learning > Performance Analysis
    - Accuracy (1.00)
