Statistical inference for individual fairness
Maity, Subha, Xue, Songkai, Yurochkin, Mikhail, Sun, Yuekai
As we rely on machine learning (ML) models to make more consequential decisions, the issue of ML models perpetuating or even exacerbating undesirable historical biases (e.g. [...] In this paper, we focus on the problem of detecting violations of individual fairness in ML models. We formalize the problem as measuring the susceptibility of ML models to a form of adversarial attack and develop a suite of inference tools for the adversarial cost function. The tools allow auditors to assess the individual fairness of ML models in a statistically principled way: form confidence intervals for the worst-case performance differential between similar individuals and test hypotheses of model fairness with (asymptotic) non-coverage/Type I error rate control.
The problem of bias in machine learning systems is at the forefront of contemporary ML research. Numerous media outlets have scrutinized machine learning systems deployed in practice for violations of basic societal equality principles (Angwin et al., 2016; Dastin, 2018; Vigdor, 2019). In response, researchers developed many formal definitions of algorithmic fairness along with algorithms for enforcing these definitions in ML models (Dwork et al., 2011; Hardt et al., 2016; Berk et al., 2017; Kusner et al., 2018; Ritov et al., 2017; Yurochkin et al., 2020). Despite the flurry of ML fairness research, the basic question of assessing fairness of a given ML model in a statistically principled way remains largely unexplored. In this paper we propose a statistically principled approach to assessing individual fairness (Dwork et al., 2011) of ML models.
Mar-30-2021