Keswani, Vijay
Can AI Model the Complexities of Human Moral Decision-Making? A Qualitative Study of Kidney Allocation Decisions
Keswani, Vijay, Conitzer, Vincent, Sinnott-Armstrong, Walter, Nguyen, Breanna K., Heidari, Hoda, Borg, Jana Schaich
A growing body of work in Ethical AI attempts to capture human moral judgments through simple computational models. The key question we address in this work is whether such simple AI models capture the critical nuances of moral decision-making by focusing on the use case of kidney allocation. We conducted twenty interviews where participants explained their rationale for their judgments about who should receive a kidney. We observe that participants: (a) value patients' morally-relevant attributes to different degrees; (b) use diverse decision-making processes, citing heuristics to reduce decision complexity; (c) can change their opinions; (d) sometimes lack confidence in their decisions (e.g., due to incomplete information); and (e) express enthusiasm and concern regarding AI assisting humans in kidney allocation decisions. Based on these findings, we discuss challenges of computationally modeling moral judgments as a stand-in for human input, highlight drawbacks of current approaches, and suggest future directions to address these issues.
Fair Classification with Partial Feedback: An Exploration-Based Data-Collection Approach
Keswani, Vijay, Mehrotra, Anay, Celis, L. Elisa
In many predictive contexts (e.g., credit lending), true outcomes are only observed for samples that were positively classified in the past. These past observations, in turn, form training datasets for classifiers that make future predictions. However, such training datasets lack information about the outcomes of samples that were (incorrectly) negatively classified in the past and can lead to erroneous classifiers. We present an approach that trains a classifier using available data and comes with a family of exploration strategies to collect outcome data about subpopulations that otherwise would have been ignored. For any exploration strategy, the approach comes with guarantees that (1) all subpopulations are explored, (2) the fraction of false positives is bounded, and (3) the trained classifier converges to a "desired" classifier. The right exploration strategy is context-dependent; it can be chosen to improve learning guarantees and encode context-specific group fairness properties. Evaluation on real-world datasets shows that this approach consistently boosts the quality of collected outcome data and improves the fraction of true positives for all groups, with only a small reduction in predictive utility.
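The partial-feedback loop described above can be illustrated with a minimal sketch: train on the samples whose outcomes are already observed, then collect outcomes both for newly accepted samples and for a small explored fraction of rejected ones. The epsilon-greedy exploration rule, the synthetic data, and all names below are illustrative assumptions, not the paper's specific strategies or guarantees.

```python
# Minimal sketch of a partial-feedback training loop with exploration.
# The epsilon-greedy rule and synthetic data are illustrative assumptions only.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, d, rounds, epsilon = 2000, 5, 10, 0.1

X_all = rng.normal(size=(n, d))
y_all = (X_all @ rng.normal(size=d) + 0.3 * rng.normal(size=n) > 0).astype(int)

# Start with a small bootstrap sample whose outcomes were observed in the past.
observed = rng.choice(n, size=200, replace=False).tolist()

for t in range(rounds):
    clf = LogisticRegression().fit(X_all[observed], y_all[observed])
    candidates = np.setdiff1d(np.arange(n), observed)
    if len(candidates) == 0:
        break
    scores = clf.predict_proba(X_all[candidates])[:, 1]

    accepted = candidates[scores >= 0.5]          # positively classified: outcomes observed
    rejected = candidates[scores < 0.5]
    # Exploration: also collect outcomes for a random fraction of rejected samples,
    # so subpopulations that would otherwise be ignored still generate feedback.
    explore = (rng.choice(rejected, size=int(epsilon * len(rejected)), replace=False)
               if len(rejected) else np.array([], dtype=int))

    observed.extend(accepted.tolist())
    observed.extend(explore.tolist())

print("final number of samples with observed outcomes:", len(observed))
```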
Addressing Strategic Manipulation Disparities in Fair Classification
Keswani, Vijay, Celis, L. Elisa
In real-world classification settings, such as loan application evaluation or content moderation on online platforms, individuals respond to classifier predictions by strategically updating their features to increase their likelihood of receiving a particular (positive) decision (at a certain cost). Yet, when different demographic groups have different feature distributions or pay different update costs, prior work has shown that individuals from minority groups often pay a higher cost to update their features. Fair classification aims to address such classifier performance disparities by constraining the classifiers to satisfy statistical fairness properties. However, we show that standard fairness constraints do not guarantee that the constrained classifier reduces the disparity in strategic manipulation cost. To address such biases in strategic settings and provide equal opportunities for strategic manipulation, we propose a constrained optimization framework that constructs classifiers that lower the strategic manipulation cost for minority groups. We develop our framework by studying theoretical connections between group-specific strategic cost disparity and standard selection rate fairness metrics (e.g., statistical rate and true positive rate). Empirically, we show the efficacy of this approach over multiple real-world datasets.
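The quantity at issue above can be illustrated concretely: for a linear classifier, a simple proxy for a negatively classified individual's manipulation cost is the distance they must move to cross the decision boundary. The sketch below measures this cost per group under an L2 cost model; the data, the cost model, and the variable names are illustrative assumptions, not the paper's exact formulation or its constrained optimization framework.

```python
# Sketch: group-specific strategic manipulation cost for a linear classifier.
# Cost = L2 distance a negatively classified point must move to reach the
# decision boundary -- an illustrative cost model, not the paper's exact one.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n, d = 4000, 4
group = rng.integers(0, 2, size=n)                  # protected attribute
# Group 0 is, on average, drawn farther from the qualification threshold.
X = rng.normal(loc=group[:, None] * 0.8, size=(n, d))
y = (X.sum(axis=1) + rng.normal(size=n) > 1.5).astype(int)

clf = LogisticRegression().fit(X, y)
w, b = clf.coef_[0], clf.intercept_[0]

margin = X @ w + b
neg = margin < 0                                    # negatively classified points
cost = -margin[neg] / np.linalg.norm(w)             # distance to the decision boundary

for g in (0, 1):
    mask = group[neg] == g
    print(f"group {g}: mean manipulation cost = {cost[mask].mean():.3f}")
```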
Designing Closed-Loop Models for Task Allocation
Keswani, Vijay, Celis, L. Elisa, Kenthapadi, Krishnaram, Lease, Matthew
Automatically assigning tasks to people is challenging because human performance can vary across tasks for many reasons. This challenge is further compounded in real-life settings in which no oracle exists to assess the quality of human decisions and task assignments made. Instead, we find ourselves in a "closed" decision-making loop in which the same fallible human decisions we rely on in practice must also be used to guide task allocation. How can imperfect and potentially biased human decisions train an accurate allocation model? Our key insight is to exploit weak prior information on human-task similarity to bootstrap model training. We show that the use of such a weak prior can improve task allocation accuracy, even when human decision-makers are fallible and biased. We present both theoretical analysis and empirical evaluation over synthetic data and a social media toxicity detection task. Results demonstrate the efficacy of our approach.
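A toy sketch of the closed-loop idea follows: with no oracle available, a weak prior on worker-task fit bootstraps allocation, and agreement with a (fallible) peer stands in for correctness when refining that prior. The running-average update rule, the peer-agreement signal, and all names below are illustrative assumptions, not the paper's algorithm or analysis.

```python
# Toy closed-loop task allocation: a noisy prior on worker-task similarity
# bootstraps allocation; peer agreement (not ground truth) refines it.
import numpy as np

rng = np.random.default_rng(2)
n_workers, n_task_types, n_tasks = 4, 3, 3000

true_acc = rng.uniform(0.55, 0.95, size=(n_workers, n_task_types))
# Weak prior: a noisy, clipped view of worker-task fit.
prior = np.clip(true_acc + rng.normal(scale=0.15, size=true_acc.shape), 0.5, 1.0)

score = prior.copy()            # current belief about worker-task fit
counts = np.ones_like(score)    # pseudo-count so the prior is not washed out at once

for _ in range(n_tasks):
    t = rng.integers(n_task_types)
    truth = rng.integers(2)                       # latent correct answer, never observed

    # Allocate to the currently best-looking worker, with a little exploration.
    w = rng.integers(n_workers) if rng.random() < 0.1 else int(score[:, t].argmax())
    peer = rng.choice([i for i in range(n_workers) if i != w])

    ans_w = truth if rng.random() < true_acc[w, t] else 1 - truth
    ans_p = truth if rng.random() < true_acc[peer, t] else 1 - truth

    # Closed-loop feedback: agreement with a fallible peer stands in for correctness.
    counts[w, t] += 1
    score[w, t] += (float(ans_w == ans_p) - score[w, t]) / counts[w, t]

print("true best worker per task type:   ", true_acc.argmax(axis=0))
print("learned best worker per task type:", score.argmax(axis=0))
```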
Auditing for Diversity using Representative Examples
Keswani, Vijay, Celis, L. Elisa
Assessing the diversity of a dataset of information associated with people is crucial before using such data for downstream applications. For a given dataset, this often involves computing the imbalance or disparity in the empirical marginal distribution of a protected attribute (e.g., gender, dialect). However, real-world datasets, such as images from Google Search or collections of Twitter posts, often do not have protected attributes labeled. Consequently, to derive disparity measures for such datasets, the elements need to be hand-labeled or crowd-annotated, both of which are expensive processes. We propose a cost-effective approach to approximate the disparity of a given unlabeled dataset, with respect to a protected attribute, using a control set of labeled representative examples. Our proposed algorithm uses the pairwise similarity between elements in the dataset and elements in the control set to effectively bootstrap an approximation to the disparity of the dataset. Importantly, we show that using a control set whose size is much smaller than the size of the dataset is sufficient to achieve a small approximation error. Further, based on our theoretical framework, we also provide an algorithm to construct adaptive control sets that achieve smaller approximation errors than randomly chosen control sets. Simulations on two image datasets and one Twitter dataset demonstrate the efficacy of our approach (using random and adaptive control sets) in auditing the diversity of a wide variety of datasets.
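A minimal sketch of the similarity-based estimate follows, assuming feature embeddings for dataset and control elements are already available. Assigning each unlabeled element the label of its most similar control element is a simplification of the paper's similarity-weighted estimator; the function names and synthetic demo are illustrative.

```python
# Sketch: approximate the protected-attribute disparity of an unlabeled dataset
# using a small labeled control set, via nearest-control-element soft labeling.
import numpy as np

def estimate_disparity(data_emb, control_emb, control_labels):
    """data_emb: (n, d) unlabeled embeddings; control_emb: (m, d) with m << n."""
    # Cosine similarity between every dataset element and every control element.
    a = data_emb / np.linalg.norm(data_emb, axis=1, keepdims=True)
    b = control_emb / np.linalg.norm(control_emb, axis=1, keepdims=True)
    sims = a @ b.T                                     # (n, m)
    inferred = control_labels[sims.argmax(axis=1)]     # nearest control element's label
    frac = inferred.mean()                             # fraction with attribute value 1
    return abs(frac - (1 - frac))                      # imbalance between the two groups

# Tiny synthetic demo: group-1 embeddings are shifted relative to group 0, and the
# control set is drawn from the same pool for simplicity.
rng = np.random.default_rng(3)
labels = (rng.random(5000) < 0.7).astype(int)          # true 70/30 imbalance
emb = rng.normal(loc=labels[:, None] * 1.5, size=(5000, 16))
ctrl = rng.choice(5000, size=40, replace=False)        # small labeled control set
print("estimated disparity:", round(estimate_disparity(emb, emb[ctrl], labels[ctrl]), 3))
print("true disparity:     ", round(abs(labels.mean() - (1 - labels.mean())), 3))
```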
Towards Unbiased and Accurate Deferral to Multiple Experts
Keswani, Vijay, Lease, Matthew, Kenthapadi, Krishnaram
Machine learning models are often deployed in pipelines alongside humans, with the model having the option to defer to a domain expert when it has low confidence in its inference. Our goal is to design mechanisms for ensuring accuracy and fairness in such prediction systems that combine machine learning model inferences and domain expert predictions. Prior work on "deferral systems" in classification settings has focused on pipelines with a single expert and aimed to accommodate the inaccuracies and biases of this expert to simultaneously learn an inference model and a deferral system. Our work extends this framework to settings where multiple experts are available, with each expert having their own domain of expertise and biases. We propose a framework that simultaneously learns a classifier and a deferral system, with the deferral system choosing to defer to one or more human experts on inputs where the classifier has low confidence. We test our framework on a synthetic dataset and a content moderation dataset with biased synthetic experts, and show that it significantly improves the accuracy and fairness of the final predictions compared to the baselines. We also collect crowdsourced labels for the content moderation task to construct a real-world dataset for the evaluation of hybrid machine-human frameworks, and show that our proposed learning framework outperforms baselines on this real-world dataset as well.
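The sketch below only illustrates the inference-time behavior of such a system: predict when the classifier is confident, otherwise route the input to the expert estimated to be most reliable on inputs of that kind. The joint training of the classifier and deferral policy described in the abstract is not reproduced; the synthetic experts, the confidence threshold, and the domain split are illustrative assumptions.

```python
# Sketch of inference-time deferral to multiple (biased) experts.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
n, d = 3000, 6
X = rng.normal(size=(n, d))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.8, size=n) > 0).astype(int)
domain = (X[:, 2] > 0).astype(int)            # two "domains" of inputs

# Three synthetic experts, each reliable on a different domain.
expert_acc = np.array([[0.95, 0.60],          # expert 0: good on domain 0
                       [0.60, 0.95],          # expert 1: good on domain 1
                       [0.75, 0.75]])         # expert 2: mediocre everywhere

def expert_label(k, i):
    return y[i] if rng.random() < expert_acc[k, domain[i]] else 1 - y[i]

clf = LogisticRegression().fit(X[:1500], y[:1500])

# Estimate each expert's per-domain accuracy on a validation split (indices 1500-1999).
est = np.zeros((3, 2))
for k in range(3):
    for dom in range(2):
        idx = [i for i in range(1500, 2000) if domain[i] == dom]
        est[k, dom] = np.mean([expert_label(k, i) == y[i] for i in idx])

# Predict when confident; otherwise defer to the best expert for the input's domain.
correct = 0
for i in range(2000, n):
    proba = clf.predict_proba(X[i:i + 1])[0]
    if proba.max() >= 0.8:
        pred = int(proba.argmax())
    else:
        best = int(est[:, domain[i]].argmax())
        pred = expert_label(best, i)
    correct += int(pred == y[i])
print("accuracy of classifier + deferral:", round(correct / 1000, 3))
```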
Fair Distributions from Biased Samples: A Maximum Entropy Optimization Framework
Celis, L. Elisa, Keswani, Vijay, Yildiz, Ozan, Vishnoi, Nisheeth K.
One reason for the emergence of bias in AI systems is biased data: datasets that may not be true representations of the underlying distributions and may over- or under-represent groups with respect to protected attributes such as gender or race. We consider the problem of correcting such biases and learning distributions that are "fair", with respect to measures such as proportional representation and statistical parity, from the given samples. Our approach is based on a novel formulation of the problem of learning a fair distribution as a maximum entropy optimization problem with a given expectation vector and a prior distribution. Technically, our main contributions are: (1) a new second-order method to compute the (dual of the) maximum entropy distribution over an exponentially-sized discrete domain that turns out to be faster than previous methods, and (2) methods to construct prior distributions and expectation vectors that provably guarantee that the learned distributions satisfy a wide class of fairness criteria. Our results also come with quantitative bounds on the total variation distance between the empirical distribution obtained from the samples and the learned fair distribution. Our experimental results include testing our approach on the COMPAS dataset and showing that the fair distributions not only improve disparate impact values but, when used to train classifiers, incur only a small loss of accuracy.
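In standard max-entropy notation, the kind of program involved can be summarized as below; the feature map and symbols are generic, not taken verbatim from the paper. The dual is a convex function of the multipliers, which is the object a second-order method can target.

```latex
% Illustrative max-entropy formulation: given a prior $q$ over a finite domain $\Omega$,
% a feature map $\phi:\Omega\to\mathbb{R}^d$, and a target expectation vector $\theta$,
\begin{aligned}
\max_{p \in \Delta(\Omega)} \;\; & -\sum_{x \in \Omega} p(x)\,\log\frac{p(x)}{q(x)}
\quad \text{s.t.} \quad \sum_{x \in \Omega} p(x)\,\phi(x) = \theta, \\
\text{with solution} \quad & p_{\lambda^{\star}}(x) \propto q(x)\,e^{\langle \lambda^{\star}, \phi(x)\rangle},
\qquad
\lambda^{\star} \in \arg\min_{\lambda \in \mathbb{R}^d}
\Big( \log \sum_{x \in \Omega} q(x)\, e^{\langle \lambda, \phi(x)\rangle} - \langle \lambda, \theta\rangle \Big).
\end{aligned}
```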
Implicit Diversity in Image Summarization
Celis, L. Elisa, Keswani, Vijay
Case studies, such as Kay et al., 2015, have shown that in image summarization, such as Google Image Search, the people in the results presented for occupations are more imbalanced with respect to sensitive attributes such as gender and ethnicity than the ground truth. Most of the existing approaches to correct for this problem in image summarization assume that the images are labeled and use these labels to train the model and correct for biases. However, these labels may not always be present. Furthermore, it is often not possible (nor even desirable) to automatically classify images by sensitive attributes such as gender or race. Moreover, balancing according to the labels does not guarantee that the diversity will be visibly apparent, arguably the only metric that matters when selecting diverse images. We develop a novel approach that takes as input a visibly diverse control set of images and uses it to select images, in response to a query, that are similarly visibly diverse. We implement this approach using pre-trained and modified Convolutional Neural Networks like VGG-16, and evaluate it empirically on the image dataset compiled and used by Kay et al., 2015. We compare our results with the Google Image Search results from Kay et al., 2015 and natural baselines, and observe that our algorithm produces images that are accurate with respect to their similarity to the query images (on par with the Google Image Search results) but significantly outperforms them with respect to visible diversity, as measured by similarity to our diverse control set.
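A small sketch of control-set-guided selection follows, assuming CNN feature vectors (e.g., VGG-16 activations) have already been extracted for the candidate, query, and control images. The greedy rule that trades off query relevance against covering the visibly diverse control set is a simplification of the paper's approach, and all names and weights are illustrative.

```python
# Sketch: select query results that are relevant and collectively similar to a
# visibly diverse control set, given precomputed image embeddings.
import numpy as np

def cosine(a, b):
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T

def diverse_summary(cand_emb, query_emb, control_emb, k, alpha=0.5):
    """Pick k candidates: relevant to the query, and collectively covering the
    control set (each control image should have a close match among the picks)."""
    relevance = cosine(cand_emb, query_emb[None, :]).ravel()     # (n,)
    ctrl_sim = cosine(cand_emb, control_emb)                     # (n, m)
    chosen = []
    for _ in range(k):
        # Coverage gain: how much a candidate improves the best match to each
        # control image, summed over the control set.
        best = ctrl_sim[chosen].max(axis=0) if chosen else np.zeros(ctrl_sim.shape[1])
        gain = np.maximum(ctrl_sim - best, 0).sum(axis=1)
        score = alpha * relevance + (1 - alpha) * gain
        score[chosen] = -np.inf                                  # no repeats
        chosen.append(int(score.argmax()))
    return chosen

# Toy demo with random stand-in "embeddings".
rng = np.random.default_rng(5)
cands, query, control = rng.normal(size=(200, 64)), rng.normal(size=64), rng.normal(size=(6, 64))
print("selected indices:", diverse_summary(cands, query, control, k=10))
```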
Improved Adversarial Learning for Fair Classification
Celis, L. Elisa, Keswani, Vijay
Motivated by concerns that machine learning algorithms may introduce significant bias in classification models, developing fair classifiers has become an important problem in machine learning research. One important paradigm towards this has been providing algorithms for adversarially learning fair classifiers (Zhang et al., 2018; Madras et al., 2018). We formulate the adversarial learning problem as a multi-objective optimization problem and find the fair model using a gradient descent-ascent algorithm with a modified gradient update step, inspired by the approach of Zhang et al., 2018. We provide theoretical insight and guarantees that formalize the heuristic arguments presented previously for taking such an approach. We test our approach empirically on the Adult dataset and synthetic datasets and compare against state-of-the-art algorithms (Celis et al., 2018; Zhang et al., 2018; Zafar et al., 2017). The results show that our models and algorithms have comparable or better accuracy than other algorithms while performing better in terms of fairness, as measured using statistical rate or false discovery rate.
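For intuition, a minimal gradient descent-ascent loop is sketched below: an adversary tries to recover the protected attribute from the classifier's output, and the classifier is trained to be accurate while defeating it. This is plain alternating GDA on synthetic data; the modified gradient update step referenced in the abstract (and the projection term of Zhang et al., 2018) is not reproduced here.

```python
# Minimal alternating gradient descent-ascent sketch for adversarially fair
# classification (illustrative; not the paper's modified update step).
import torch

torch.manual_seed(0)
n, d, lam = 4000, 8, 1.0
z = (torch.rand(n) < 0.5).float()                       # protected attribute
X = torch.randn(n, d) + z[:, None] * 0.7
y = ((X[:, 0] + X[:, 1] + 0.5 * z + 0.3 * torch.randn(n)) > 0.8).float()

clf = torch.nn.Linear(d, 1)
adv = torch.nn.Linear(1, 1)                             # predicts z from the classifier score
opt_clf = torch.optim.Adam(clf.parameters(), lr=1e-2)
opt_adv = torch.optim.Adam(adv.parameters(), lr=1e-2)
bce = torch.nn.BCEWithLogitsLoss()

for step in range(2000):
    logits = clf(X).squeeze(1)

    # Ascent step: adversary learns to predict z from the (detached) classifier output.
    adv_loss = bce(adv(logits.detach().unsqueeze(1)).squeeze(1), z)
    opt_adv.zero_grad(); adv_loss.backward(); opt_adv.step()

    # Descent step: classifier minimizes its own loss minus the adversary's loss.
    clf_loss = bce(logits, y) - lam * bce(adv(logits.unsqueeze(1)).squeeze(1), z)
    opt_clf.zero_grad(); clf_loss.backward(); opt_clf.step()

with torch.no_grad():
    pred = (clf(X).squeeze(1) > 0).float()
    for g in (0.0, 1.0):
        print(f"group {int(g)} positive rate: {pred[z == g].mean().item():.3f}")
```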
Classification with Fairness Constraints: A Meta-Algorithm with Provable Guarantees
Celis, L. Elisa, Huang, Lingxiao, Keswani, Vijay, Vishnoi, Nisheeth K.
Developing classification algorithms that are fair with respect to sensitive attributes of the data has become an important problem due to the growing deployment of classification algorithms in various social contexts. Several recent works have focused on fairness with respect to a specific metric, modeled the corresponding fair classification problem as a constrained optimization problem, and developed tailored algorithms to solve it. Despite this, there still remain important metrics for which we do not have fair classifiers, and many of the aforementioned algorithms do not come with theoretical guarantees, perhaps because the resulting optimization problem is non-convex. The main contribution of this paper is a new meta-algorithm for classification that takes as input a large class of fairness constraints, with respect to multiple non-disjoint sensitive attributes, and which comes with provable guarantees. This is achieved by first developing a meta-algorithm for a large family of classification problems with convex constraints, and then showing that classification problems with general types of fairness constraints can be reduced to those in this family. We present empirical results that show that our algorithm can achieve near-perfect fairness with respect to various fairness metrics, and that the loss in accuracy due to the imposed fairness constraints is often small. Overall, this work unifies several prior works on fair classification, presents a practical algorithm with theoretical guarantees, and can handle fairness metrics that prior methods could not.
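To make the kind of constrained program concrete, the sketch below fits a logistic model subject to a bound on the gap in smoothed selection rates between two groups, using scipy's SLSQP solver. This is a generic constrained-classification illustration under assumed synthetic data, not the paper's meta-algorithm or its reduction to convex-constrained problems.

```python
# Sketch: logistic loss minimized subject to a bound on the (smoothed)
# selection-rate gap between two groups. Illustrative only.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(6)
n, d, tau = 2000, 5, 0.05
g = (rng.random(n) < 0.4).astype(float)            # protected group indicator
X = rng.normal(size=(n, d)) + g[:, None] * 0.6
y = (X[:, 0] + X[:, 1] + 0.4 * g + rng.normal(scale=0.7, size=n) > 0.8).astype(float)
Xb = np.hstack([X, np.ones((n, 1))])               # add intercept column

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def log_loss(w):
    p = sigmoid(Xb @ w)
    return -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))

def rate_gap(w):
    p = sigmoid(Xb @ w)                            # smoothed acceptance probability
    return p[g == 1].mean() - p[g == 0].mean()

# SLSQP inequality constraints require fun(w) >= 0, so bound the gap on both sides.
cons = [{"type": "ineq", "fun": lambda w: tau - rate_gap(w)},
        {"type": "ineq", "fun": lambda w: tau + rate_gap(w)}]

res = minimize(log_loss, np.zeros(d + 1), constraints=cons, method="SLSQP")
print("loss:", round(log_loss(res.x), 3), " smoothed rate gap:", round(rate_gap(res.x), 3))
```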