fair regression
Fair regression via plug-in estimator and recalibration with statistical guarantees
We study the problem of learning an optimal regression function subject to a fairness constraint. It requires that, conditionally on the sensitive feature, the distribution of the function output remains the same. This constraint naturally extends the notion of demographic parity, often used in classification, to the regression setting. We tackle this problem by leveraging a proxy-discretized version, for which we derive an explicit expression of the optimal fair predictor. This result naturally suggests a two-stage approach, in which we first estimate the (unconstrained) regression function from a set of labeled data and then recalibrate it with another set of unlabeled data. The recalibration step can be efficiently performed via smooth optimization. We derive rates of convergence of the proposed estimator to the optimal fair predictor, both in terms of the risk and of the fairness constraint. Finally, we present numerical experiments illustrating that the proposed method is often superior to or competitive with state-of-the-art methods.
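To make the constraint concrete, the following is a minimal sketch of one common way to measure how far a set of predictions is from demographic parity: the largest Kolmogorov-Smirnov distance between the empirical prediction distributions of any two groups. The function name `dp_violation_ks` and the choice of this particular metric are assumptions made for illustration; the abstract does not specify the measure used.

```python
import numpy as np

def dp_violation_ks(pred, groups):
    """Largest Kolmogorov-Smirnov distance between the empirical
    distributions of predictions in any two sensitive groups.
    A value of 0 means the empirical distributions coincide."""
    pred = np.asarray(pred, dtype=float)
    groups = np.asarray(groups)
    grid = np.sort(pred)  # evaluate every group CDF at all observed predictions
    cdfs = []
    for g in np.unique(groups):
        own = np.sort(pred[groups == g])
        # Empirical CDF of group g evaluated on the common grid.
        cdfs.append(np.searchsorted(own, grid, side="right") / own.size)
    gaps = [np.max(np.abs(a - b)) for i, a in enumerate(cdfs) for b in cdfs[i + 1:]]
    return max(gaps, default=0.0)

# Hypothetical toy data: the two groups receive disjoint prediction ranges.
separated = dp_violation_ks([0.0, 1.0, 2.0, 3.0], [0, 0, 1, 1])
# Hypothetical toy data: both groups receive the same predictions.
identical = dp_violation_ks([0.0, 1.0, 0.0, 1.0], [0, 0, 1, 1])
```

A fair predictor in the sense above would drive this quantity toward zero on held-out data.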
Equal Opportunity of Coverage in Fair Regression
We study fair machine learning (ML) under predictive uncertainty to enable reliable and trustworthy decision-making. The seminal work on 'equalized coverage' proposed an uncertainty-aware fairness notion. However, it does not guarantee equal coverage rates across more fine-grained groups (e.g., low-income females) conditioning on the true label, and it is biased in the assessment of uncertainty. To tackle these limitations, we propose a new uncertainty-aware fairness notion, Equal Opportunity of Coverage (EOC), that aims to achieve two properties: (1) coverage rates for different groups with similar outcomes are close, and (2) the coverage rate for the entire population remains at a predetermined level. Further, the prediction intervals should be narrow to be informative. We propose Binned Fair Quantile Regression (BFQR), a distribution-free post-processing method that improves EOC with reasonable interval width for any trained ML model. It first calibrates on a hold-out set to bound the deviation from EOC, then leverages conformal prediction to maintain EOC on a test set while optimizing prediction interval width. Experimental results demonstrate the effectiveness of our method in improving EOC.
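As a rough illustration of the conformal prediction ingredient, here is a sketch of plain split conformal prediction with symmetric intervals, plus a helper that reports the coverage rate per group. This is not BFQR: the binning and fairness calibration steps are omitted, and the function names and toy numbers are assumptions made for this example.

```python
import numpy as np

def conformal_halfwidth(abs_residuals, alpha):
    """Half-width of a split-conformal interval from calibration residuals.
    Uses the ceil((n + 1) * (1 - alpha))-th smallest absolute residual."""
    r = np.sort(np.asarray(abs_residuals, dtype=float))
    k = int(np.ceil((r.size + 1) * (1 - alpha)))
    # With too few calibration points the interval has to be unbounded.
    return r[k - 1] if k <= r.size else np.inf

def coverage_by_group(y, pred, halfwidth, groups):
    """Fraction of targets inside [pred - halfwidth, pred + halfwidth] per group."""
    y, pred, groups = np.asarray(y), np.asarray(pred), np.asarray(groups)
    covered = np.abs(y - pred) <= halfwidth
    return {int(g): float(covered[groups == g].mean()) for g in np.unique(groups)}

# Hypothetical calibration residuals 1..9 and a toy test set with two groups.
half = conformal_halfwidth(np.arange(1, 10), alpha=0.5)
rates = coverage_by_group(y=[0, 0, 0, 0], pred=[0.5, 2.0, 0.1, 3.0],
                          halfwidth=1.0, groups=[0, 0, 1, 1])
```

Comparing the entries of `rates` across groups, ideally after conditioning on the outcome as EOC requires, is what a fairness-aware calibration step would try to equalize.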
Fair regression with Wasserstein barycenters
We study the problem of learning a real-valued function that satisfies the Demographic Parity constraint. It demands that the distribution of the predicted output be independent of the sensitive attribute. We consider the case in which the sensitive attribute is available for prediction. We establish a connection between fair regression and optimal transport theory, based on which we derive a closed-form expression for the optimal fair predictor. Specifically, we show that the distribution of this optimum is the Wasserstein barycenter of the distributions induced by the standard regression function on the sensitive groups. This result offers an intuitive interpretation of the optimal fair prediction and suggests a simple post-processing algorithm to achieve fairness. We establish risk and distribution-free fairness guarantees for this procedure. Numerical experiments indicate that our method is very effective in learning fair models, with a relative increase in error rate that is smaller than the relative gain in fairness.
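In one dimension, the quantile function of a Wasserstein-2 barycenter is the weighted average of the group quantile functions, which suggests the following empirical post-processing sketch. It assumes the scores of an already trained regressor are given, uses empirical CDFs and quantiles on a single sample, and ignores ties; the function name `fair_postprocess` and the synthetic data are illustrative assumptions rather than the paper's exact procedure.

```python
import numpy as np

def fair_postprocess(scores, groups):
    """Send each score to the barycenter quantile at its within-group rank."""
    scores = np.asarray(scores, dtype=float)
    groups = np.asarray(groups)
    labels, counts = np.unique(groups, return_counts=True)
    weights = counts / counts.sum()  # empirical group frequencies
    sorted_scores = {g: np.sort(scores[groups == g]) for g in labels}
    fair = np.empty_like(scores)
    for g in labels:
        mask = groups == g
        own = sorted_scores[g]
        # Empirical CDF of each score within its own group, in (0, 1].
        ranks = np.searchsorted(own, scores[mask], side="right") / own.size
        # Weighted average of all group quantile functions at those ranks.
        fair[mask] = sum(w * np.quantile(sorted_scores[h], ranks)
                         for w, h in zip(weights, labels))
    return fair

# Synthetic example: group 1 scores are shifted upward by about 2.
rng = np.random.default_rng(0)
groups = np.repeat([0, 1], 500)
scores = np.concatenate([rng.normal(0.0, 1.0, 500), rng.normal(2.0, 1.0, 500)])
fair = fair_postprocess(scores, groups)
```

After the mapping, both groups share the same empirical distribution of outputs in this equal-size example, while the ordering of individuals within each group is preserved.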
Liouville PDE-based sliced-Wasserstein flow for fair regression
The sliced Wasserstein flow (SWF), a nonparametric and implicit generative gradient flow, is applied to fair regression. We improve the SWF in two respects. First, the Fokker-Planck equation-based Monte Carlo scheme, with its stochastic diffusive term, is replaced by Liouville partial differential equation (PDE)-based transport with density estimation, which has no diffusive term. Second, the computation of the Wasserstein barycenter is approximated by the SWF barycenter, with Kantorovich potentials prescribed for the induced gradient flow that generates its samples. These two changes improve convergence in training and testing of the SWF and SWF barycenters, with reduced variance. Applying the generative SWF barycenter to fair regression yields competitive accuracy-fairness Pareto curves.
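For readers unfamiliar with the "sliced" construction, the sketch below estimates the sliced Wasserstein-2 distance between two equal-size point clouds by averaging one-dimensional transport costs over random projection directions. It only illustrates the distance that such flows are built on, not the Liouville PDE-based flow itself; the function name, the number of projections, and the test data are assumptions.

```python
import numpy as np

def sliced_wasserstein2(x, y, n_proj=200, seed=0):
    """Monte Carlo estimate of the sliced Wasserstein-2 distance between
    two point clouds with the same number of points."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    # Random unit directions on the sphere.
    dirs = rng.normal(size=(n_proj, x.shape[1]))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    # In 1D, optimal transport matches sorted projections to each other.
    px = np.sort(x @ dirs.T, axis=0)
    py = np.sort(y @ dirs.T, axis=0)
    return float(np.sqrt(np.mean((px - py) ** 2)))

# Hypothetical data: a Gaussian cloud and a copy shifted by one unit along the first axis.
rng = np.random.default_rng(1)
cloud = rng.normal(size=(100, 3))
shift = np.array([1.0, 0.0, 0.0])
same = sliced_wasserstein2(cloud, cloud)
moved = sliced_wasserstein2(cloud, cloud + shift)
```

Because each projection only requires a sort, the estimate stays cheap in high dimension, which is the usual motivation for slicing.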
Review for NeurIPS paper: Fair regression via plug-in estimator and recalibration with statistical guarantees
Summary and Contributions: This paper provides a new algorithm to train a regression function subject to a demographic-parity-like fairness constraint. The proposed approach constructs a plug-in estimator by first training an unconstrained regression function on labeled data and then calibrating the model on unlabeled data to satisfy the fairness constraint. The final model is a "regression function with discrete outputs". The authors show convergence rates to the optimal fair regression model, and demonstrate competitive empirical performance compared to previous approaches for fair regression. I'm still of the opinion that the technical gap I pointed out is an important one, and that the analysis would have been much more complete and satisfying had the guarantees for the optimization algorithm been on the gradients of the dual objective.
Review for NeurIPS paper: Fair regression with Wasserstein barycenters
Weaknesses: My biggest worry is that I'm not sure whether this work adds significantly new contributions compared to the previous literature that uses optimal transport theory for fair classification. It seems to be a modification of the post-processing approach in "Wasserstein Fair Classification" (Jiang et al.). I would be happy to increase the score if the authors can highlight some challenges faced in adapting approaches from previous work to this regression problem, and explain why these challenges are not trivial. I wish there were a little more discussion about looking at these fairness-constrained optimization problems through the lens of optimal transport theory; the paper only considers demographic parity, but a discussion of why it is or is not immediate that this approach would work for other fairness notions, such as equalized odds (appropriately 're-defined' for the regression problem), would help. I also wonder whether it is possible to allow some slack in demographic parity (the difference can be at most some epsilon).
Fair Regression under Sample Selection Bias
Wei Du, Xintao Wu, Hanghang Tong
Recent research on fair regression has focused on developing new fairness notions and approximation methods, because target variables and even the sensitive attribute are continuous in the regression setting. However, all previous fair regression research assumed that the training data and testing data are drawn from the same distribution. This assumption is often violated in the real world due to sample selection bias between the training and testing data. In this paper, we develop a framework for fair regression under sample selection bias when the dependent variable values of a set of samples from the training data are missing as a result of another hidden process. Our framework adopts the classic Heckman model for bias correction and Lagrange duality to achieve fairness in regression based on a variety of fairness notions. The Heckman model describes the sample selection process and uses a derived variable called the Inverse Mills Ratio (IMR) to correct sample selection bias. We use fairness inequality and equality constraints to describe a variety of fairness notions and apply Lagrange duality theory to transform the primal problem into a dual convex optimization problem. For the two popular fairness notions, mean difference and mean squared error difference, we derive explicit formulas without iterative optimization, and for Pearson correlation, we derive the conditions under which strong duality holds. We conduct experiments on three real-world datasets, and the experimental results demonstrate the approach's effectiveness in terms of both utility and fairness metrics.
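As background on the correction term, the Inverse Mills Ratio in the standard Heckman two-step procedure is the ratio of the standard normal density to the standard normal CDF, evaluated at the fitted selection index, and it is added as an extra regressor in the outcome equation. The sketch below computes only that ratio; the function name is an assumption, and the abstract does not say how the paper combines the IMR with the fairness constraints.

```python
import math

def inverse_mills_ratio(z):
    """phi(z) / Phi(z) for the standard normal distribution.
    Note: for very negative z the CDF underflows, so a production
    implementation would need a numerically stable formulation."""
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return pdf / cdf

# Hypothetical selection indices: a lower index (less likely to be selected)
# produces a larger correction term.
imr_low = inverse_mills_ratio(-1.0)
imr_mid = inverse_mills_ratio(0.0)
imr_high = inverse_mills_ratio(1.0)
```

The ratio decreases as the selection index grows, so samples that were almost certain to be observed receive only a small correction.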