xauc
- North America > United States > Missouri > Jackson County > Kansas City (0.04)
- North America > United States > California (0.04)
- North America > Canada (0.04)
- Law (1.00)
- Banking & Finance (1.00)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.68)
- Government (0.68)
TranSUN: A Preemptive Paradigm to Eradicate Retransformation Bias Intrinsically from Regression Models in Recommender Systems
Yu, Jiahao, Liu, Haozhuang, Yang, Yeqiu, Chen, Lu, Wu, Jian, Jiang, Yuning, Zheng, Bo
Regression models are crucial in recommender systems. However, the retransformation bias problem has been conspicuously neglected within the community. While many works in other fields have devised effective bias correction methods, all of them are post-hoc cures external to the model, and they face practical challenges when applied to real-world recommender systems. Hence, we propose a preemptive paradigm to eradicate the bias intrinsically from the models via minor model refinement. Specifically, a novel TranSUN method is proposed that learns the bias jointly with the model, offering theoretically guaranteed unbiasedness with empirically superior convergence. It is further generalized into a novel generic regression model family, termed Generalized TranSUN (GTS), which not only offers more theoretical insights but also serves as a generic framework for flexibly developing various bias-free models. Comprehensive experimental results demonstrate the superiority of our methods across data from various domains. The methods have been successfully deployed in two real-world industrial recommendation scenarios, i.e., product and short video recommendation in the Guess What You Like business domain on the homepage of the Taobao App (a leading e-commerce platform with DAU > 300M), to serve the major online traffic.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Asia > Singapore > Central Region > Singapore (0.04)
- Asia > China > Beijing > Beijing (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
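The retransformation bias named in the TranSUN abstract above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the TranSUN method: it assumes a log-transformed, lognormally distributed target, and the function name `retransformation_demo` and its parameters are hypothetical. It compares the naive back-transform `exp(mean(log y))` with the sample mean of `y`, and also shows the standard post-hoc lognormal correction `exp(mean + variance / 2)`, which is the kind of external cure the abstract contrasts against.

```python
import math
import random


def retransformation_demo(n=100_000, mu=0.0, sigma=1.0, seed=0):
    """Compare naive and corrected back-transforms on simulated lognormal data.

    Assumption: log(y) ~ Normal(mu, sigma^2). A model that predicts the mean of
    log(y) and then applies exp() underestimates the mean of y.
    """
    rng = random.Random(seed)
    log_y = [rng.gauss(mu, sigma) for _ in range(n)]
    y = [math.exp(v) for v in log_y]

    mean_y = sum(y) / n                      # target quantity: E[y]
    mean_log = sum(log_y) / n                # what a regressor on log(y) estimates
    var_log = sum((v - mean_log) ** 2 for v in log_y) / n

    naive = math.exp(mean_log)               # biased: this is the geometric mean
    corrected = math.exp(mean_log + var_log / 2)  # post-hoc lognormal correction

    return {"true_mean": mean_y, "naive": naive, "lognormal_corrected": corrected}


if __name__ == "__main__":
    for name, value in retransformation_demo().items():
        print(f"{name}: {value:.4f}")
```

The naive estimate is the geometric mean of `y`, which never exceeds the arithmetic mean, so the gap appears for any spread in `log y`. The correction shown here relies on the lognormal assumption and is applied outside the model.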
FairPOT: Balancing AUC Performance and Fairness with Proportional Optimal Transport
Liu, Pengxi, Shen, Yi, Engelhard, Matthew M., Goldstein, Benjamin A., Pencina, Michael J., Economou-Zavlanos, Nicoleta J., Zavlanos, Michael M.
Fairness metrics utilizing the area under the receiver operating characteristic curve (AUC) have gained increasing attention in high-stakes domains such as healthcare, finance, and criminal justice. In these domains, fairness is often evaluated over risk scores rather than binary outcomes, and a common challenge is that enforcing strict fairness can significantly degrade AUC performance. To address this challenge, we propose Fair Proportional Optimal Transport (FairPOT), a novel, model-agnostic post-processing framework that aligns risk score distributions across different groups using optimal transport, but does so selectively by transforming only a controllable proportion, i.e., the top-lambda quantile, of scores within the disadvantaged group. By varying lambda, our method allows for a tunable trade-off between reducing AUC disparities and maintaining overall AUC performance. Furthermore, we extend FairPOT to the partial AUC setting, enabling fairness interventions to concentrate on the highest-risk regions. Extensive experiments on synthetic, public, and clinical datasets show that FairPOT consistently outperforms existing post-processing techniques in both global and partial AUC scenarios, often achieving improved fairness with slight AUC degradation or even positive gains in utility. The computational efficiency and practical adaptability of FairPOT make it a promising solution for real-world deployment.
- Research Report > Promising Solution (0.68)
- Research Report > New Finding (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Data Science > Data Mining (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Threshold-Independent Fair Matching through Score Calibration
Moslemi, Mohammad Hossein, Milani, Mostafa
Entity Matching (EM) is a critical task in numerous fields, such as healthcare, finance, and public administration, as it identifies records that refer to the same entity within or across different databases. EM faces considerable challenges, particularly with false positives and negatives. These are typically addressed by generating matching scores and applying thresholds to balance false positives and negatives in various contexts. However, adjusting these thresholds can affect the fairness of the outcomes, a critical factor that remains largely overlooked in current fair EM research. The existing body of research on fair EM tends to concentrate on static thresholds, neglecting their critical impact on fairness. To address this, we introduce a new approach in EM using recent metrics for evaluating biases in score-based binary classification, particularly through the lens of distributional parity. This approach enables the application of various bias metrics like equalized odds, equal opportunity, and demographic parity without depending on threshold settings. Our experiments with leading matching methods reveal potential biases, and by applying a calibration technique for EM scores using Wasserstein barycenters, we not only mitigate these biases but also preserve accuracy across real-world datasets. This paper contributes to fairness in data cleaning, and in particular to EM, a central data cleaning task, by promoting a method for generating matching scores that reduces biases across different thresholds.
- North America > Canada > Ontario > Middlesex County > London (0.14)
- North America > United States > District of Columbia > Washington (0.05)
- North America > United States > Wisconsin > Dane County > Madison (0.04)
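The idea of calibrating scores with a Wasserstein barycenter, mentioned in the entity matching abstract above, has a simple form in one dimension: the barycenter's quantile function is the weighted average of the groups' quantile functions. The following is a generic sketch under stated assumptions, not the paper's procedure. The function names `empirical_quantile` and `barycenter_calibrate`, the group-size weights, and the linear interpolation between order statistics are all illustrative choices.

```python
from bisect import bisect_right


def empirical_quantile(sorted_scores, q):
    """Inverse empirical CDF at level q in [0, 1], with linear interpolation."""
    n = len(sorted_scores)
    pos = q * (n - 1)
    lo = int(pos)
    hi = min(lo + 1, n - 1)
    frac = pos - lo
    return sorted_scores[lo] * (1 - frac) + sorted_scores[hi] * frac


def barycenter_calibrate(scores_by_group):
    """Map each group's scores onto a shared 1D Wasserstein barycenter.

    Assumption: every group has at least one score; weights are group shares.
    Each score keeps its within-group rank, so ordering inside a group is kept.
    """
    sorted_by_group = {g: sorted(v) for g, v in scores_by_group.items()}
    total = sum(len(v) for v in sorted_by_group.values())
    weights = {g: len(v) / total for g, v in sorted_by_group.items()}

    calibrated = {}
    for group, scores in scores_by_group.items():
        ref = sorted_by_group[group]
        n = len(ref)
        out = []
        for s in scores:
            # Quantile level of s within its own group.
            q = (bisect_right(ref, s) - 1) / (n - 1) if n > 1 else 0.5
            # Barycenter quantile: weighted average of all groups' quantiles at q.
            out.append(sum(weights[h] * empirical_quantile(sorted_by_group[h], q)
                           for h in sorted_by_group))
        calibrated[group] = out
    return calibrated


if __name__ == "__main__":
    print(barycenter_calibrate({"a": [0.1, 0.2, 0.3], "b": [0.5, 0.7, 0.9]}))
```

After this mapping the groups share approximately the same score distribution, so the fraction of records above any given threshold is approximately equal across groups. Whether accuracy is preserved depends on the data and is not something this sketch establishes.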
Bipartite Ranking Fairness through a Model Agnostic Ordering Adjustment
Cui, Sen, Pan, Weishen, Zhang, Changshui, Wang, Fei
Algorithmic fairness has been a serious concern and has received a lot of interest in the machine learning community. In this paper, we focus on the bipartite ranking scenario, where the instances come from either the positive or negative class and the goal is to learn a ranking function that ranks positive instances higher than negative ones. While there could be a trade-off between fairness and performance, we propose a model-agnostic post-processing framework, xOrder, for achieving fairness in bipartite ranking while maintaining the algorithm's classification performance. In particular, we optimize a weighted sum of the utility and fairness by identifying an optimal warping path across different protected groups, and solve it through a dynamic programming process. xOrder is compatible with various classification models and ranking fairness metrics, including supervised and unsupervised fairness metrics. In addition to binary groups, xOrder can be applied to multiple protected groups. We evaluate our proposed algorithm on four benchmark data sets and two real-world patient electronic health record repositories. xOrder consistently achieves a better balance between algorithm utility and ranking fairness on a variety of datasets with different metrics. Visualization of the calibrated ranking scores shows that xOrder mitigates the score distribution shifts between groups compared with baselines. Moreover, additional analytical results verify that xOrder achieves robust performance when faced with fewer samples and a bigger difference between training and testing ranking score distributions.
- Asia > China > Beijing > Beijing (0.04)
- North America > United States > New York (0.04)
- North America > United States > Oregon > Lane County > Eugene (0.04)
- North America > United States > California > Alameda County > Berkeley (0.04)
- Research Report > Experimental Study (0.45)
- Research Report > New Finding (0.45)
- Health & Medicine > Therapeutic Area (1.00)
- Health & Medicine > Health Care Technology > Medical Record (0.54)
Disparate Censorship & Undertesting: A Source of Label Bias in Clinical Machine Learning
Chang, Trenton, Sjoding, Michael W., Wiens, Jenna
As machine learning (ML) models gain traction in clinical applications, understanding the impact of clinician and societal biases on ML models is increasingly important. While biases can arise in the labels used for model training, the many sources from which these biases arise are not yet well-studied. In this paper, we highlight disparate censorship (i.e., differences in testing rates across patient groups) as a source of label bias that clinical ML models may amplify, potentially causing harm. Many patient risk-stratification models are trained using the results of clinician-ordered diagnostic and laboratory tests as labels. Patients without test results are often assigned a negative label, which assumes that untested patients do not experience the outcome. Since orders are affected by clinical and resource considerations, testing may not be uniform across patient populations, giving rise to disparate censorship. Disparate censorship in patients of equivalent risk leads to undertesting in certain groups and, in turn, more biased labels for such groups. Using such biased labels in standard ML pipelines could contribute to gaps in model performance across patient groups. Here, we theoretically and empirically characterize conditions in which disparate censorship or undertesting affects model performance across subgroups. Our findings call attention to disparate censorship as a source of label bias in clinical ML models.
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
- Asia > Middle East > Israel (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
xOrder: A Model Agnostic Post-Processing Framework for Achieving Ranking Fairness While Maintaining Algorithm Utility
Cui, Sen, Pan, Weishen, Zhang, Changshui, Wang, Fei
Algorithmic fairness has received a lot of interest in machine learning recently. In this paper, we focus on the bipartite ranking scenario, where the instances come from either the positive or negative class and the goal is to learn a ranking function that ranks positive instances higher than negative ones. In an unfair setting, the probabilities of ranking the positives higher than negatives are different across different protected groups. We propose a general post-processing framework, xOrder, for achieving fairness in bipartite ranking while maintaining the algorithm's classification performance. In particular, we optimize a weighted sum of the utility and fairness by directly adjusting the relative ordering across groups. We formulate this problem as identifying an optimal warping path across different protected groups and solve it through a dynamic programming process. xOrder is compatible with various classification models and applicable to a variety of ranking fairness metrics. We evaluate our proposed algorithm on four benchmark data sets and two real-world patient electronic health record repositories. The experimental results show that our approach can achieve a good balance between algorithm utility and ranking fairness. Our algorithm can also achieve robust performance when training and testing ranking score distributions are significantly different.
- Asia > China > Beijing > Beijing (0.04)
- North America > United States > Oregon > Lane County > Eugene (0.04)
- Health & Medicine > Therapeutic Area (1.00)
- Health & Medicine > Health Care Technology > Medical Record (0.54)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
The Fairness of Risk Scores Beyond Classification: Bipartite Ranking and the xAUC Metric
Where machine-learned predictive risk scores inform high-stakes decisions, such as bail and sentencing in criminal justice, fairness has been a serious concern. Recent work has characterized the disparate impact that such risk scores can have when used for a binary classification task and provided tools to audit and adjust resulting classifiers. This may not account, however, for the more diverse downstream uses of risk scores and their non-binary nature. To better account for this, in this paper, we investigate the fairness of predictive risk scores from the point of view of a bipartite ranking task, where one seeks to rank positive examples higher than negative ones. We introduce the xAUC disparity as a metric to assess the disparate impact of risk scores and define it as the difference in the probabilities of ranking a random positive example from one protected group above a negative one from another group and vice versa. We provide a decomposition of bipartite ranking loss into components that involve the discrepancy and components that involve pure predictive ability within each group. We further provide an interpretation of the xAUC discrepancy in terms of resource allocation fairness and make connections to existing fairness metrics and adjustments. We assess xAUC empirically on datasets in recidivism prediction, income prediction, and cardiac arrest prediction, where it describes disparities that are not evident from simply comparing within-group predictive performance.
- North America > United States > Missouri > Jackson County > Kansas City (0.04)
- North America > United States > California (0.04)
- Law (1.00)
- Banking & Finance (0.93)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.48)
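The xAUC disparity described in the abstract above can be estimated directly from scores, labels, and group membership. This is a minimal sketch following the abstract's verbal definition: the probability that a random positive from one group is ranked above a random negative from another group, minus the same quantity with the groups swapped. The function names `xauc` and `xauc_disparity` are illustrative, and counting ties as one half is an assumption borrowed from the usual AUC convention rather than something stated in the source.

```python
def xauc(pos_scores, neg_scores):
    """Estimate P(positive score > negative score); ties count 1/2 (assumption)."""
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def xauc_disparity(scores, labels, groups, group_a, group_b):
    """xAUC(a, b) - xAUC(b, a): cross-group ranking gap between two groups."""
    def pick(group, label):
        return [s for s, y, g in zip(scores, labels, groups)
                if g == group and y == label]

    # Positives of a ranked against negatives of b, and the reverse.
    xauc_ab = xauc(pick(group_a, 1), pick(group_b, 0))
    xauc_ba = xauc(pick(group_b, 1), pick(group_a, 0))
    return xauc_ab - xauc_ba


if __name__ == "__main__":
    scores = [0.9, 0.5, 0.7, 0.6, 0.4, 0.2]
    labels = [1, 1, 0, 1, 0, 0]
    groups = ["a", "a", "a", "b", "b", "b"]
    print(xauc_disparity(scores, labels, groups, "a", "b"))
```

In this toy data, positives from group a always outrank negatives from group b, while the single positive from group b is outranked by the negative from group a, so the disparity is at its maximum of 1.0. The double loop is quadratic in sample size; a rank-based computation would be preferable on large datasets.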