Goto

Collaborating Authors

 Gupta, Vivek


Equalizing Recourse across Groups

arXiv.org Artificial Intelligence

The rise in machine learning-assisted decision-making has led to concerns about the fairness of the decisions and techniques to mitigate problems of discrimination. If a negative decision is made about an individual (denying a loan, rejecting an application for housing, and so on) justice dictates that we be able to ask how we might change circumstances to get a favorable decision the next time. Moreover, the ability to change circumstances (a better education, improved credentials) should not be limited to only those with access to expensive resources. In other words, \emph{recourse} for negative decisions should be considered a desirable value that can be equalized across (demographically defined) groups. This paper describes how to build models that make accurate predictions while still ensuring that the penalties for a negative outcome do not disadvantage different groups disproportionately. We measure recourse as the distance of an individual from the decision boundary of a classifier. We then introduce a regularized objective to minimize the difference in recourse across groups. We explore linear settings and further extend recourse to non-linear settings as well as model-agnostic settings where the exact distance from boundary cannot be calculated. Our results show that we can successfully decrease the unfairness in recourse while maintaining classifier performance.


Efficient Estimation of Generalization Error and Bias-Variance Components of Ensembles

arXiv.org Machine Learning

For many applications, an ensemble of base classifiers is an effective solution. The tuning of its parameters(number of classes, amount of data on which each classifier is to be trained on, etc.) requires G, the generalization error of a given ensemble. The efficient estimation of G is the focus of this paper. The key idea is to approximate the variance of the class scores/probabilities of the base classifiers over the randomness imposed by the training subset by normal/beta distribution at each point x in the input feature space. We estimate the parameters of the distribution using a small set of randomly chosen base classifiers and use those parameters to give efficient estimation schemes for G. We give empirical evidence for the quality of the various estimators. We also demonstrate their usefulness in making design choices such as the number of classifiers in the ensemble and the size of a subset of data used for training that is needed to achieve a certain value of generalization error. Our approach also has great potential for designing distributed ensemble classifiers.