EquiPy: Sequential Fairness using Optimal Transport in Python
Machado, Agathe Fernandes, Grondin, Suzie, Ratz, Philipp, Charpentier, Arthur, Hu, François
Algorithmic fairness has received considerable attention due to the failures of various predictive AI systems that have been found to be unfairly biased against subgroups of the population. Many approaches have been proposed to mitigate such biases in predictive systems; however, they often struggle to provide accurate estimates and transparent correction mechanisms when multiple sensitive variables, such as a combination of gender and race, are involved. This paper introduces a new open-source Python package, EquiPy, which provides an easy-to-use and model-agnostic toolbox for efficiently achieving fairness across multiple sensitive variables. It also offers comprehensive graphical utilities that enable the user to interpret the influence of each sensitive variable within a global context. EquiPy makes use of theoretical results that allow the complexity arising from the use of multiple variables to be broken down into easier-to-solve sub-problems. We demonstrate the ease of use for both mitigation and interpretation on publicly available data derived from the US Census and provide sample code for its use.
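For a single sensitive variable, the repair behind EquiPy has a simple quantile form: each group's score distribution is mapped onto the Wasserstein-2 barycenter of all group distributions. The sketch below is our own NumPy illustration of that mechanism, not EquiPy's actual API (names and code are ours; see the package documentation for the real interface).

```python
# Our own NumPy sketch of single-attribute barycenter repair (illustrative
# only; EquiPy's actual interface may differ).
import numpy as np

def barycenter_repair(scores, group):
    """Map each group's score distribution onto the Wasserstein-2
    barycenter of all groups, removing the dependence on `group`."""
    scores, group = np.asarray(scores, float), np.asarray(group)
    groups, counts = np.unique(group, return_counts=True)
    weights = counts / counts.sum()            # group proportions p_a
    fair = np.empty_like(scores)
    for g in groups:
        mask = group == g
        # empirical CDF value of each score within its own group
        u = (np.argsort(np.argsort(scores[mask])) + 1) / (mask.sum() + 1)
        # barycenter quantile: proportion-weighted mix of group quantiles
        fair[mask] = sum(w * np.quantile(scores[group == a], u)
                         for a, w in zip(groups, weights))
    return fair

rng = np.random.default_rng(0)
g = rng.integers(0, 2, 1000)
s = rng.normal(0.5 + 0.3 * g, 0.1)             # scores shifted by group
s_fair = barycenter_repair(s, g)
print(s_fair[g == 0].mean(), s_fair[g == 1].mean())  # nearly equal means
```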
From Uncertainty to Precision: Enhancing Binary Classifier Performance through Calibration
Machado, Agathe Fernandes, Charpentier, Arthur, Flachaire, Emmanuel, Gallic, Ewen, Hu, François
Binary classification tasks are prevalent in learning algorithms, as diverse scenarios require binary decisions. Examples include predicting default risk or accident occurrence in insurance or finance, as well as disease likelihood in healthcare. To be reliable, particularly in sensitive decision-making contexts, a classifier must possess strong discriminatory capabilities. Typically, classifiers are trained to optimize goodness-of-fit criteria, often based on the accuracy of class predictions. However, such criteria, including accuracy or AUC, do not account for the varying confidence levels the algorithm assigns to each prediction. If the sole objective is effective class prediction, the classifier fulfills its purpose. Nevertheless, there are instances where interest extends beyond the predicted class to the associated likelihood. This occurs when predicting loan repayment defaults (Liu et al., 2021) or accident incidences, since risk transfer pricing is usually tied directly to event probabilities. In such cases, the scores predicted by a classifier are often interpreted as event probabilities.
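As a concrete illustration of the distinction drawn above, the following scikit-learn sketch (a generic recalibration recipe, not the paper's pipeline) recalibrates a classifier with isotonic regression: discrimination, measured by AUC, is largely unchanged, while the Brier score, which penalizes miscalibrated probabilities, typically improves.

```python
# Generic recalibration sketch with scikit-learn (not the paper's code):
# AUC measures ranking only; the Brier score also reflects calibration.
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import brier_score_loss, roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, weights=[0.8], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

raw = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
# cross-validated isotonic recalibration of the same base learner
cal = CalibratedClassifierCV(
    RandomForestClassifier(random_state=0), method="isotonic", cv=5
).fit(X_tr, y_tr)

for name, model in [("raw", raw), ("isotonic", cal)]:
    p = model.predict_proba(X_te)[:, 1]
    # discrimination (AUC) is barely affected; calibration (Brier) improves
    print(f"{name:9s} AUC={roc_auc_score(y_te, p):.3f} "
          f"Brier={brier_score_loss(y_te, p):.4f}")
```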
Geospatial Disparities: A Case Study on Real Estate Prices in Paris
Machado, Agathe Fernandes, Hu, François, Ratz, Philipp, Gallic, Ewen, Charpentier, Arthur
Driven by an increasing prevalence of trackers, ever more IoT sensors, and the declining cost of computing power, geospatial information has come to play a pivotal role in contemporary predictive models. While enhancing prognostic performance, geospatial data also has the potential to perpetuate many historical socio-economic patterns, raising concerns about a resurgence of biases and exclusionary practices, with their disproportionate impacts on society. Addressing this, our paper emphasizes the crucial need to identify and rectify such biases and calibration errors in predictive models, particularly as algorithms become more intricate and less interpretable. The increasing granularity of geospatial information further introduces ethical concerns, as choosing different geographical scales may exacerbate disparities akin to redlining and exclusionary zoning. To address these issues, we propose a toolkit for identifying and mitigating biases arising from geospatial data. Extending classical fairness definitions, we incorporate an ordinal regression case with spatial attributes, deviating from the binary classification focus. This extension allows us to gauge disparities stemming from data aggregation levels and advocates for a less interfering correction approach. Illustrating our methodology using a Parisian real estate dataset, we showcase practical applications and scrutinize the implications of choosing geographical aggregation levels for fairness and calibration measures.
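To make the aggregation-level point concrete, here is a toy sketch of our own (synthetic data, not the Parisian dataset or the paper's code): the same crude disparity measure, the spread of unit-level mean predictions, is computed at a coarse and at a fine geographic scale, and typically widens as the units get smaller.

```python
# Toy illustration with synthetic data (our construction): the same disparity
# measure widens as the geographic aggregation gets finer.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 10_000
df = pd.DataFrame({
    "district": rng.integers(0, 20, n),                # coarse spatial unit
    "pred_price": rng.lognormal(9.0, 0.3, n),          # model predictions
})
df["block"] = df["district"] * 50 + rng.integers(0, 50, n)  # fine spatial unit

for level in ["district", "block"]:
    means = df.groupby(level)["pred_price"].mean()
    # spread of unit-level mean predictions: a crude disparity measure,
    # partly inflated by small-sample noise at the finer scale
    print(level, "max/min ratio of unit means:",
          round(means.max() / means.min(), 2))
```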
Fairness Explainability using Optimal Transport with Applications in Image Classification
Ratz, Philipp, Hu, François, Charpentier, Arthur
Ensuring trust and accountability in Artificial Intelligence systems demands explainability of their outcomes. Despite significant progress in Explainable AI, human biases still taint a substantial portion of training data, raising concerns about unfairness or discriminatory tendencies. Current approaches in the field of Algorithmic Fairness focus on mitigating such biases in the outcomes of a model, but few attempts have been made to explain why a model is biased. To bridge this gap between the two fields, we propose a comprehensive approach that uses optimal transport theory to uncover the causes of discrimination in Machine Learning applications, with a particular emphasis on image classification. We leverage Wasserstein barycenters to achieve fair predictions and introduce an extension to pinpoint bias-associated regions. This allows us to derive a cohesive system which uses the enforced fairness to measure each feature's influence on the bias. Taking advantage of this interplay between enforcing and explaining fairness, our method holds significant implications for the development of trustworthy and unbiased AI systems, fostering transparency, accountability, and fairness in critical decision-making scenarios across diverse domains.
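The following toy sketch (our construction, not the paper's implementation) illustrates the idea in one dimension: after mapping group-wise scores onto their Wasserstein-2 barycenter, the per-sample correction applied by the transport can be correlated with each feature, and features entangled with the sensitive attribute stand out.

```python
# Toy sketch (ours, not the paper's implementation): the per-sample fairness
# correction induced by the transport to the barycenter flags features that
# are entangled with the sensitive attribute.
import numpy as np

rng = np.random.default_rng(2)
n = 4000
group = rng.integers(0, 2, n)
x_biased = group + rng.normal(0, 0.5, n)   # feature entangled with the group
x_clean = rng.normal(0, 1, n)              # feature unrelated to the group
score = 0.6 * x_biased + 0.2 * x_clean     # a "model" score

# group-wise quantile mapping onto the barycenter (equal weights, since the
# two groups are balanced here)
fair = np.empty(n)
for g in (0, 1):
    m = group == g
    u = (np.argsort(np.argsort(score[m])) + 1) / (m.sum() + 1)
    fair[m] = 0.5 * (np.quantile(score[group == 0], u)
                     + np.quantile(score[group == 1], u))

displacement = fair - score                # how much fairness moved each score
for name, x in [("x_biased", x_biased), ("x_clean", x_clean)]:
    print(name, "corr. with correction:",
          round(abs(np.corrcoef(x, displacement)[0, 1]), 2))
```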
A Sequentially Fair Mechanism for Multiple Sensitive Attributes
Hu, François, Ratz, Philipp, Charpentier, Arthur
In the standard use case of Algorithmic Fairness, the goal is to eliminate the relationship between a sensitive variable and a corresponding score. In recent years, the scientific community has developed a host of definitions and tools to solve this task, which work well in many practical applications. However, the applicability and effectiveness of these tools and definitions become less straightforward in the case of multiple sensitive attributes. To tackle this issue, we propose a sequential framework which progressively achieves fairness across a set of sensitive features. We accomplish this by leveraging multi-marginal Wasserstein barycenters, extending the standard notion of Strong Demographic Parity to the case of multiple sensitive characteristics. This method also provides a closed-form solution for the optimal, sequentially fair predictor, permitting a clear interpretation of inter-sensitive feature correlations. Our approach seamlessly extends to approximate fairness, providing a framework that accommodates the trade-off between risk and unfairness. This extension permits a targeted prioritization of fairness improvements for a specific attribute within a set of sensitive attributes, allowing for case-specific adaptation. A data-driven estimation procedure for the derived solution is developed, and comprehensive numerical experiments are conducted on both synthetic and real datasets. Our empirical findings decisively underscore the practical efficacy of our post-processing approach in fostering fair decision-making.
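A minimal sketch of the sequential idea, in our own NumPy (a finite-sample illustration under proportion weights, not the paper's exact estimator): repair the score for one sensitive attribute at a time by group-wise quantile mapping onto the current barycenter.

```python
# Our minimal sketch of sequential fairness (illustrative, not the paper's
# estimator): repair one sensitive attribute at a time.
import numpy as np

def repair_one(scores, attr):
    """Single-attribute Wasserstein-2 barycenter repair (proportion weights)."""
    groups, counts = np.unique(attr, return_counts=True)
    w = counts / counts.sum()
    out = np.empty_like(scores, dtype=float)
    for g in groups:
        m = attr == g
        u = (np.argsort(np.argsort(scores[m])) + 1) / (m.sum() + 1)
        out[m] = sum(wi * np.quantile(scores[attr == gi], u)
                     for gi, wi in zip(groups, w))
    return out

def sequential_repair(scores, sensitive):
    """Apply the repair attribute by attribute (columns of `sensitive`)."""
    s = np.asarray(scores, dtype=float)
    for j in range(sensitive.shape[1]):
        s = repair_one(s, sensitive[:, j])
    return s

rng = np.random.default_rng(3)
n = 3000
A = rng.integers(0, 2, size=(n, 2))                 # two sensitive attributes
s = rng.normal(0.4 * A[:, 0] + 0.2 * A[:, 1], 0.1)  # biased scores
s_fair = sequential_repair(s, A)
print([round(s_fair[A[:, j] == v].mean(), 3)        # near-equal group means
       for j in range(2) for v in (0, 1)])
```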
Fairness in Multi-Task Learning via Wasserstein Barycenters
Hu, François, Ratz, Philipp, Charpentier, Arthur
Algorithmic Fairness is an established field in machine learning that aims to reduce biases in data. Recent advances have proposed various methods to ensure fairness in a univariate environment, where the goal is to de-bias a single task. However, extending fairness to a multi-task setting, where more than one objective is optimised using a shared representation, remains underexplored. To bridge this gap, we develop a method that extends the definition of Strong Demographic Parity to multi-task learning using multi-marginal Wasserstein barycenters. Our approach provides a closed-form solution for the optimal fair multi-task predictor, covering both regression and binary classification tasks. We develop a data-driven estimation procedure for the solution and run numerical experiments on both synthetic and real datasets. The empirical results highlight the practical value of our post-processing methodology in promoting fair decision-making.
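In the multi-task analogue, the same barycenter transform is applied to each task's output with the shared sensitive attribute. The toy sketch below (ours, not the paper's code) repairs a regression output and a classification score side by side.

```python
# Toy sketch (ours): the multi-task analogue applies the barycenter transform
# to each task's output with the same sensitive attribute.
import numpy as np

def barycenter_transform(y, a):
    """Group-wise quantile mapping onto the barycenter (proportion weights)."""
    groups, counts = np.unique(a, return_counts=True)
    w = counts / counts.sum()
    out = np.empty_like(y, dtype=float)
    for g in groups:
        m = a == g
        u = (np.argsort(np.argsort(y[m])) + 1) / (m.sum() + 1)
        out[m] = sum(wi * np.quantile(y[a == gi], u)
                     for gi, wi in zip(groups, w))
    return out

rng = np.random.default_rng(4)
a = rng.integers(0, 2, 2000)
y_reg = rng.normal(a * 1.0, 1.0)                     # regression task output
p_clf = 1 / (1 + np.exp(-rng.normal(a * 0.8, 1.0)))  # classification scores
fair_reg, fair_clf = (barycenter_transform(y, a) for y in (y_reg, p_clf))
print(round(fair_reg[a == 0].mean() - fair_reg[a == 1].mean(), 3),
      round(fair_clf[a == 0].mean() - fair_clf[a == 1].mean(), 3))
```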
Mitigating Discrimination in Insurance with Wasserstein Barycenters
Charpentier, Arthur, Hu, François, Ratz, Philipp
The insurance industry is heavily reliant on predictions of risks based on characteristics of potential customers. Although the use of such models is common, researchers have long pointed out that the practice perpetuates discrimination based on sensitive features such as gender or race. Given that this discrimination can often be attributed to historical biases in the data, eliminating, or at least mitigating, it is desirable. With the shift from more traditional models to machine-learning-based predictions, calls for greater mitigation have grown anew, as simply excluding sensitive variables from the pricing process can be shown to be ineffective. In this article, we first investigate why predictions are a necessity within the industry and why correcting biases is not as straightforward as simply identifying a sensitive variable. We then propose to mitigate the biases through the use of Wasserstein barycenters instead of simple scaling. To demonstrate the effects and effectiveness of the approach, we apply it to real data and discuss its implications.
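The difference between simple scaling and the barycenter approach is easy to see on skewed premiums. In the toy comparison below (our synthetic data, not the paper's), mean scaling equalizes average premiums across groups but leaves the tails apart, while the barycenter repair matches the full distributions.

```python
# Toy comparison on synthetic premiums (ours): mean scaling equalizes means
# but not tails; the barycenter repair matches the full distributions.
import numpy as np

rng = np.random.default_rng(5)
g = rng.integers(0, 2, 20_000)
# group 1 has a similar mean premium but a heavier right tail
prem = np.where(g == 0, rng.lognormal(6.0, 0.3, g.size),
                        rng.lognormal(5.9, 0.55, g.size))

# naive correction: shift each group's premiums to the overall mean
scaled = (prem - np.array([prem[g == 0].mean(), prem[g == 1].mean()])[g]
          + prem.mean())

# barycenter correction: group-wise quantile mapping (proportion weights)
fair = np.empty_like(prem)
for gi in (0, 1):
    m = g == gi
    u = (np.argsort(np.argsort(prem[m])) + 1) / (m.sum() + 1)
    fair[m] = sum((g == k).mean() * np.quantile(prem[g == k], u)
                  for k in (0, 1))

for name, p in [("mean scaling", scaled), ("barycenter", fair)]:
    q0, q1 = (np.quantile(p[g == k], 0.95) for k in (0, 1))
    print(name, "95th-percentile gap:", round(abs(q0 - q1), 1))
```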
An overview of active learning methods for insurance with fairness appreciation
Elie, Romuald, Hillairet, Caroline, Hu, François, Juillard, Marc
This paper addresses several challenges in the adoption of machine learning in insurance that come with the democratization of model deployment. The first challenge is reducing the labelling effort (and hence focusing on data quality) with the help of active learning, a feedback loop between model inference and an oracle: since unlabelled data is usually abundant in insurance, active learning can become a significant asset in reducing labelling costs. To that end, this paper sketches out various classical active learning methodologies before studying their empirical impact on both synthetic and real datasets. Another key challenge in insurance is fairness in model inferences. We introduce and integrate a post-processing fairness step for multi-class tasks into this active learning framework to address both issues. Finally, numerical experiments on unfair datasets highlight that the proposed setup offers a good compromise between model precision and fairness.
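For readers unfamiliar with the mechanics, here is a classical uncertainty-sampling loop sketched with scikit-learn (a generic illustration of the surveyed strategies; the fairness post-processing step discussed in the paper is omitted).

```python
# Generic uncertainty-sampling loop (illustrative, not the paper's code).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=3000, n_informative=5, random_state=0)
labeled = list(range(20))                      # small initial labelled pool
pool = [i for i in range(len(y)) if i not in labeled]

for round_ in range(10):
    clf = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])
    proba = clf.predict_proba(X[pool])[:, 1]
    # query the points the model is least certain about (margin sampling)
    uncertain = np.argsort(np.abs(proba - 0.5))[:20]
    newly = [pool[i] for i in uncertain]
    labeled += newly                           # the "oracle" labels them
    pool = [i for i in pool if i not in set(newly)]
    print(f"round {round_}: {len(labeled)} labels, "
          f"accuracy on full data {clf.score(X, y):.3f}")
```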