Practical Bias Mitigation through Proxy Sensitive Attribute Label Generation

Chaudhary, Bhushan; Pandey, Anubha; Bhatt, Deepak; Tiwari, Darshika

arXiv.org Artificial Intelligence 

Machine Learning has attained high success rates in practically every field, including healthcare, finance, and education, based on the accuracy and efficiency of the model's outcome (Dastile, Çelik, and Potsane 2020; Bakator and Radosav 2018). However, these models are biased and exhibit a propensity to favor one demographic group over another in various applications, including credit and loan approval, criminal justice, and resume-based candidate shortlisting (Mehrabi et al. 2021; Gianfrancesco et al. 2018; Yapo and Weiss 2018). The idea of fairness has received a lot of attention recently to combat the discrimination arising from ML models (Dwork et al. 2012; Beutel et al. 2017; Hardt, Price, and Srebro 2016).

Similarly, zip codes can be correlated with race. Hence, the bias gets embedded in the non-sensitive attributes that are used in the model training. Based on this hypothesis, a few initial efforts have been made to mitigate bias in the absence of protected attributes (Grari, Lamprier, and Detyniecki 2022; Lahoti et al. 2020; Yan, Kao, and Ferrara 2020; Zhao et al. 2022). The most recent approach (Zhao et al. 2022) identifies related features that are correlated with the sensitive attributes and then minimizes the correlation between the related features and the model's prediction outcome to learn a fair classifier with respect to the sensitive attribute. However, identification of related features requires