The more the merrier: logical and multistage processors in credit scoring

Pérez-Peralta, Arturo, Benítez-Peña, Sandra, Lillo, Rosa E.

arXiv.org Artificial Intelligence 

Machine Learning (ML) algorithms are ubiquitous in key decision-making contexts such as organizational justice or healthcare, which has spawned a great demand for fairness in these procedures. In this paper we focus on the application of fair ML in finance, more concretely on the use of fairness techniques in credit scoring. This paper makes two contributions. On the one hand, it addresses the existing gap concerning the application of established methods in the literature to the case of multiple sensitive variables through the use of a new technique called logical processors (LP). On the other hand, it explores the novel method of multistage processors (MP) to investigate whether combining fairness methods can work synergistically to produce solutions with improved fairness or accuracy. Furthermore, we examine the intersection of these two lines of research by exploring the integration of fairness methods in the multivariate case. The results are promising: they suggest that logical processors are an appropriate way of handling multiple sensitive variables and that multistage processors are capable of improving the performance of existing methods.

Introduction

In the last decades, institutions have increasingly relied on artificial intelligence (AI) and machine learning (ML) to aid in decision-making. Furthermore, the interplay between discrimination and calibration suggests that building a model that avoids spurious relationships between variables may increase reliability [5]. This paper focuses on the application of fair ML models in a financial context to address the problem of credit scoring, which plays a key role in loan approval [6]. Although a plethora of metrics and models have been proposed in the literature for bias mitigation, many open challenges remain. More concretely, this work explores two particular research gaps.
On the one hand, there is a demand, from both ethical and legal frameworks, for methods that handle multiple sensitive variables [7]. Furthermore, there are concerns about the unique discrimination that some individuals experience because they belong to the intersection of protected groups [8].