Multi-Differential Fairness Auditor for Black Box Classifiers
Gitiaux, Xavier, Rangwala, Huzefa
Machine learning algorithms are increasingly involved in sensitive decision-making processes with adverse implications for individuals. This paper presents mdfa, an approach that identifies the characteristics of the victims of a classifier's discrimination. We measure discrimination as a violation of multi-differential fairness. Multi-differential fairness is a guarantee that a black box classifier's outcomes do not leak information on the sensitive attributes of a small group of individuals. We reduce the problem of identifying worst-case violations to matching distributions and predicting where sensitive attributes and classifier outcomes coincide. We apply mdfa to a recidivism risk assessment classifier and demonstrate that individuals identified as African-American with little criminal history are three times more likely to be considered at high risk of violent recidivism than similar individuals not identified as African-American.
Mar-18-2019
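The core quantity behind a differential fairness violation, as described in the abstract, is whether a classifier's positive-outcome rate differs between protected and non-protected individuals inside a small subgroup. The sketch below is an illustrative empirical ratio check under that interpretation; the function name, toy data, and smoothing constant are assumptions for illustration, not the paper's mdfa implementation (which learns the worst-case subgroup rather than taking it as given).

```python
import numpy as np

def subgroup_outcome_ratio(outcomes, sensitive, subgroup_mask, eps=1e-9):
    """Illustrative differential-fairness check on one fixed subgroup.

    Compares the rate of positive classifier outcomes between
    protected (sensitive=1) and non-protected (sensitive=0)
    individuals inside the subgroup. A ratio near 1 suggests the
    outcomes leak little information about the sensitive attribute;
    a large deviation indicates a potential violation.
    """
    y = np.asarray(outcomes, dtype=bool)
    s = np.asarray(sensitive, dtype=bool)
    m = np.asarray(subgroup_mask, dtype=bool)
    # Positive-outcome rates within the subgroup, split by the
    # sensitive attribute (eps guards against empty cells).
    p1 = y[m & s].mean() if (m & s).any() else 0.0
    p0 = y[m & ~s].mean() if (m & ~s).any() else 0.0
    return (p1 + eps) / (p0 + eps)

# Toy data: within the subgroup, 3 of 4 protected individuals are
# flagged high-risk versus 1 of 4 non-protected individuals.
y = np.array([1, 1, 1, 0, 1, 0, 0, 0])
s = np.array([1, 1, 1, 1, 0, 0, 0, 0])
m = np.ones(8, dtype=bool)
ratio = subgroup_outcome_ratio(y, s, m)  # about 3.0
```

In the toy data the protected group's positive rate is 0.75 against 0.25 for the rest, giving a ratio of roughly 3, mirroring the kind of threefold disparity the abstract reports for the recidivism classifier.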