Keswani, Vijay
Can AI Model the Complexities of Human Moral Decision-Making? A Qualitative Study of Kidney Allocation Decisions
Keswani, Vijay, Conitzer, Vincent, Sinnott-Armstrong, Walter, Nguyen, Breanna K., Heidari, Hoda, Borg, Jana Schaich
A growing body of work in Ethical AI attempts to capture human moral judgments through simple computational models. The key question we address in this work is whether such simple AI models capture the critical nuances of moral decision-making by focusing on the use case of kidney allocation. We conducted twenty interviews where participants explained their rationale for their judgments about who should receive a kidney. We observe that participants: (a) value patients' morally-relevant attributes to different degrees; (b) use diverse decision-making processes, citing heuristics to reduce decision complexity; (c) can change their opinions; (d) sometimes lack confidence in their decisions (e.g., due to incomplete information); and (e) express enthusiasm and concern regarding AI assisting humans in kidney allocation decisions. Based on these findings, we discuss challenges of computationally modeling moral judgments as a stand-in for human input, highlight drawbacks of current approaches, and suggest future directions to address these issues.
Fair Classification with Partial Feedback: An Exploration-Based Data-Collection Approach
Keswani, Vijay, Mehrotra, Anay, Celis, L. Elisa
In many predictive contexts (e.g., credit lending), true outcomes are only observed for samples that were positively classified in the past. These past observations, in turn, form training datasets for classifiers that make future predictions. However, such training datasets lack information about the outcomes of samples that were (incorrectly) negatively classified in the past and can lead to erroneous classifiers. We present an approach that trains a classifier using available data and comes with a family of exploration strategies to collect outcome data about subpopulations that otherwise would have been ignored. For any exploration strategy, the approach comes with guarantees that (1) all subpopulations are explored, (2) the fraction of false positives is bounded, and (3) the trained classifier converges to a "desired" classifier. The right exploration strategy is context-dependent; it can be chosen to improve learning guarantees and encode context-specific group fairness properties. Evaluation on real-world datasets shows that this approach consistently boosts the quality of collected outcome data and improves the fraction of true positives for all groups, with only a small reduction in predictive utility.
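The partial-feedback loop described above can be illustrated with a minimal sketch: train on the samples whose outcomes are already observed, then collect outcomes both for newly accepted samples and for a small explored fraction of rejected ones. The epsilon-greedy exploration rule, the synthetic data, and all names below are illustrative assumptions, not the paper's specific strategies or guarantees.

```python
# Minimal sketch of a partial-feedback training loop with exploration.
# The epsilon-greedy rule and synthetic data are illustrative assumptions only.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, d, rounds, epsilon = 2000, 5, 10, 0.1

X_all = rng.normal(size=(n, d))
y_all = (X_all @ rng.normal(size=d) + 0.3 * rng.normal(size=n) > 0).astype(int)

# Start with a small bootstrap sample whose outcomes were observed in the past.
observed = rng.choice(n, size=200, replace=False).tolist()

for t in range(rounds):
    clf = LogisticRegression().fit(X_all[observed], y_all[observed])
    candidates = np.setdiff1d(np.arange(n), observed)
    if len(candidates) == 0:
        break
    scores = clf.predict_proba(X_all[candidates])[:, 1]

    accepted = candidates[scores >= 0.5]          # positively classified: outcomes observed
    rejected = candidates[scores < 0.5]
    # Exploration: also collect outcomes for a random fraction of rejected samples,
    # so subpopulations that would otherwise be ignored still generate feedback.
    explore = (rng.choice(rejected, size=int(epsilon * len(rejected)), replace=False)
               if len(rejected) else np.array([], dtype=int))

    observed.extend(accepted.tolist())
    observed.extend(explore.tolist())

print("final number of samples with observed outcomes:", len(observed))
```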
Addressing Strategic Manipulation Disparities in Fair Classification
Keswani, Vijay, Celis, L. Elisa
In real-world classification settings, such as loan application evaluation or content moderation on online platforms, individuals respond to classifier predictions by strategically updating their features to increase their likelihood of receiving a particular (positive) decision (at a certain cost). Yet, when different demographic groups have different feature distributions or pay different update costs, prior work has shown that individuals from minority groups often pay a higher cost to update their features. Fair classification aims to address such classifier performance disparities by constraining the classifiers to satisfy statistical fairness properties. However, we show that standard fairness constraints do not guarantee that the constrained classifier reduces the disparity in strategic manipulation cost. To address such biases in strategic settings and provide equal opportunities for strategic manipulation, we propose a constrained optimization framework that constructs classifiers that lower the strategic manipulation cost for minority groups. We develop our framework by studying theoretical connections between group-specific strategic cost disparity and standard selection rate fairness metrics (e.g., statistical rate and true positive rate). Empirically, we show the efficacy of this approach over multiple real-world datasets.
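The quantity at issue above can be illustrated concretely: for a linear classifier, a simple proxy for a negatively classified individual's manipulation cost is the distance they must move to cross the decision boundary. The sketch below measures this cost per group under an L2 cost model; the data, the cost model, and the variable names are illustrative assumptions, not the paper's exact formulation or its constrained optimization framework.

```python
# Sketch: group-specific strategic manipulation cost for a linear classifier.
# Cost = L2 distance a negatively classified point must move to reach the
# decision boundary -- an illustrative cost model, not the paper's exact one.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n, d = 4000, 4
group = rng.integers(0, 2, size=n)                  # protected attribute
# Group 0 is, on average, drawn farther from the qualification threshold.
X = rng.normal(loc=group[:, None] * 0.8, size=(n, d))
y = (X.sum(axis=1) + rng.normal(size=n) > 1.5).astype(int)

clf = LogisticRegression().fit(X, y)
w, b = clf.coef_[0], clf.intercept_[0]

margin = X @ w + b
neg = margin < 0                                    # negatively classified points
cost = -margin[neg] / np.linalg.norm(w)             # distance to the decision boundary

for g in (0, 1):
    mask = group[neg] == g
    print(f"group {g}: mean manipulation cost = {cost[mask].mean():.3f}")
```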
Designing Closed-Loop Models for Task Allocation
Keswani, Vijay, Celis, L. Elisa, Kenthapadi, Krishnaram, Lease, Matthew
Automatically assigning tasks to people is challenging because human performance can vary across tasks for many reasons. This challenge is further compounded in real-life settings in which no oracle exists to assess the quality of human decisions and task assignments made. Instead, we find ourselves in a "closed" decision-making loop in which the same fallible human decisions we rely on in practice must also be used to guide task allocation. How can imperfect and potentially biased human decisions train an accurate allocation model? Our key insight is to exploit weak prior information on human-task similarity to bootstrap model training. We show that the use of such a weak prior can improve task allocation accuracy, even when human decision-makers are fallible and biased. We present both theoretical analysis and empirical evaluation over synthetic data and a social media toxicity detection task. Results demonstrate the efficacy of our approach.
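A toy sketch of the closed-loop idea follows: with no oracle available, a weak prior on worker-task fit bootstraps allocation, and agreement with a (fallible) peer stands in for correctness when refining that prior. The running-average update rule, the peer-agreement signal, and all names below are illustrative assumptions, not the paper's algorithm or analysis.

```python
# Toy closed-loop task allocation: a noisy prior on worker-task similarity
# bootstraps allocation; peer agreement (not ground truth) refines it.
import numpy as np

rng = np.random.default_rng(2)
n_workers, n_task_types, n_tasks = 4, 3, 3000

true_acc = rng.uniform(0.55, 0.95, size=(n_workers, n_task_types))
# Weak prior: a noisy, clipped view of worker-task fit.
prior = np.clip(true_acc + rng.normal(scale=0.15, size=true_acc.shape), 0.5, 1.0)

score = prior.copy()            # current belief about worker-task fit
counts = np.ones_like(score)    # pseudo-count so the prior is not washed out at once

for _ in range(n_tasks):
    t = rng.integers(n_task_types)
    truth = rng.integers(2)                       # latent correct answer, never observed

    # Allocate to the currently best-looking worker, with a little exploration.
    w = rng.integers(n_workers) if rng.random() < 0.1 else int(score[:, t].argmax())
    peer = rng.choice([i for i in range(n_workers) if i != w])

    ans_w = truth if rng.random() < true_acc[w, t] else 1 - truth
    ans_p = truth if rng.random() < true_acc[peer, t] else 1 - truth

    # Closed-loop feedback: agreement with a fallible peer stands in for correctness.
    counts[w, t] += 1
    score[w, t] += (float(ans_w == ans_p) - score[w, t]) / counts[w, t]

print("true best worker per task type:   ", true_acc.argmax(axis=0))
print("learned best worker per task type:", score.argmax(axis=0))
```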
Auditing for Diversity using Representative Examples
Keswani, Vijay, Celis, L. Elisa
Assessing the diversity of a dataset of information associated with people is crucial before using such data for downstream applications. For a given dataset, this often involves computing the imbalance or disparity in the empirical marginal distribution of a protected attribute (e.g., gender, dialect). However, real-world datasets, such as images from Google Search or collections of Twitter posts, often do not have protected attributes labeled. Consequently, to derive disparity measures for such datasets, the elements need to be hand-labeled or crowd-annotated, both of which are expensive processes. We propose a cost-effective approach to approximate the disparity of a given unlabeled dataset, with respect to a protected attribute, using a control set of labeled representative examples. Our proposed algorithm uses the pairwise similarity between elements in the dataset and elements in the control set to effectively bootstrap an approximation to the disparity of the dataset. Importantly, we show that using a control set whose size is much smaller than the size of the dataset is sufficient to achieve a small approximation error. Further, based on our theoretical framework, we also provide an algorithm to construct adaptive control sets that achieve smaller approximation errors than randomly chosen control sets. Simulations on two image datasets and one Twitter dataset demonstrate the efficacy of our approach (using random and adaptive control sets) in auditing the diversity of a wide variety of datasets.
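A minimal sketch of the similarity-based estimate follows, assuming feature embeddings for dataset and control elements are already available. Assigning each unlabeled element the label of its most similar control element is a simplification of the paper's similarity-weighted estimator; the function names and synthetic demo are illustrative.

```python
# Sketch: approximate the protected-attribute disparity of an unlabeled dataset
# using a small labeled control set, via nearest-control-element soft labeling.
import numpy as np

def estimate_disparity(data_emb, control_emb, control_labels):
    """data_emb: (n, d) unlabeled embeddings; control_emb: (m, d) with m << n."""
    # Cosine similarity between every dataset element and every control element.
    a = data_emb / np.linalg.norm(data_emb, axis=1, keepdims=True)
    b = control_emb / np.linalg.norm(control_emb, axis=1, keepdims=True)
    sims = a @ b.T                                     # (n, m)
    inferred = control_labels[sims.argmax(axis=1)]     # nearest control element's label
    frac = inferred.mean()                             # fraction with attribute value 1
    return abs(frac - (1 - frac))                      # imbalance between the two groups

# Tiny synthetic demo: group-1 embeddings are shifted relative to group 0, and the
# control set is drawn from the same pool for simplicity.
rng = np.random.default_rng(3)
labels = (rng.random(5000) < 0.7).astype(int)          # true 70/30 imbalance
emb = rng.normal(loc=labels[:, None] * 1.5, size=(5000, 16))
ctrl = rng.choice(5000, size=40, replace=False)        # small labeled control set
print("estimated disparity:", round(estimate_disparity(emb, emb[ctrl], labels[ctrl]), 3))
print("true disparity:     ", round(abs(labels.mean() - (1 - labels.mean())), 3))
```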
Towards Unbiased and Accurate Deferral to Multiple Experts
Keswani, Vijay, Lease, Matthew, Kenthapadi, Krishnaram
Machine learning models are often deployed in pipelines alongside humans, with the model having the option to defer to a domain expert when it has low confidence in its inference. Our goal is to design mechanisms for ensuring accuracy and fairness in such prediction systems that combine machine learning model inferences and domain expert predictions. Prior work on "deferral systems" in classification settings has focused on pipelines with a single expert and aimed to accommodate the inaccuracies and biases of this expert to simultaneously learn an inference model and a deferral system. Our work extends this framework to settings where multiple experts are available, with each expert having their own domain of expertise and biases. We propose a framework that simultaneously learns a classifier and a deferral system, with the deferral system choosing to defer to one or more human experts on inputs where the classifier has low confidence. We test our framework on a synthetic dataset and a content moderation dataset with biased synthetic experts, and show that it significantly improves the accuracy and fairness of the final predictions compared to the baselines. We also collect crowdsourced labels for the content moderation task to construct a real-world dataset for the evaluation of hybrid machine-human frameworks, and show that our proposed learning framework outperforms baselines on this real-world dataset as well.
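The sketch below only illustrates the inference-time behavior of such a system: predict when the classifier is confident, otherwise route the input to the expert estimated to be most reliable on inputs of that kind. The joint training of the classifier and deferral policy described in the abstract is not reproduced; the synthetic experts, the confidence threshold, and the domain split are illustrative assumptions.

```python
# Sketch of inference-time deferral to multiple (biased) experts.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
n, d = 3000, 6
X = rng.normal(size=(n, d))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.8, size=n) > 0).astype(int)
domain = (X[:, 2] > 0).astype(int)            # two "domains" of inputs

# Three synthetic experts, each reliable on a different domain.
expert_acc = np.array([[0.95, 0.60],          # expert 0: good on domain 0
                       [0.60, 0.95],          # expert 1: good on domain 1
                       [0.75, 0.75]])         # expert 2: mediocre everywhere

def expert_label(k, i):
    return y[i] if rng.random() < expert_acc[k, domain[i]] else 1 - y[i]

clf = LogisticRegression().fit(X[:1500], y[:1500])

# Estimate each expert's per-domain accuracy on a validation split (indices 1500-1999).
est = np.zeros((3, 2))
for k in range(3):
    for dom in range(2):
        idx = [i for i in range(1500, 2000) if domain[i] == dom]
        est[k, dom] = np.mean([expert_label(k, i) == y[i] for i in idx])

# Predict when confident; otherwise defer to the best expert for the input's domain.
correct = 0
for i in range(2000, n):
    proba = clf.predict_proba(X[i:i + 1])[0]
    if proba.max() >= 0.8:
        pred = int(proba.argmax())
    else:
        best = int(est[:, domain[i]].argmax())
        pred = expert_label(best, i)
    correct += int(pred == y[i])
print("accuracy of classifier + deferral:", round(correct / 1000, 3))
```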
Fair Distributions from Biased Samples: A Maximum Entropy Optimization Framework
Celis, L. Elisa, Keswani, Vijay, Yildiz, Ozan, Vishnoi, Nisheeth K.
One reason for the emergence of bias in AI systems is biased data: datasets that may not be true representations of the underlying distributions and may over- or under-represent groups with respect to protected attributes such as gender or race. We consider the problem of correcting such biases and learning distributions that are "fair", with respect to measures such as proportional representation and statistical parity, from the given samples. Our approach is based on a novel formulation of the problem of learning a fair distribution as a maximum entropy optimization problem with a given expectation vector and a prior distribution. Technically, our main contributions are: (1) a new second-order method to compute the (dual of the) maximum entropy distribution over an exponentially-sized discrete domain that turns out to be faster than previous methods, and (2) methods to construct prior distributions and expectation vectors that provably guarantee that the learned distributions satisfy a wide class of fairness criteria. Our results also come with quantitative bounds on the total variation distance between the empirical distribution obtained from the samples and the learned fair distribution. Our experimental results include testing our approach on the COMPAS dataset and showing that the fair distributions not only improve disparate impact values but, when used to train classifiers, incur only a small loss of accuracy.
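In standard max-entropy notation, the kind of program involved can be summarized as below; the feature map and symbols are generic, not taken verbatim from the paper. The dual is a convex function of the multipliers, which is the object a second-order method can target.

```latex
% Illustrative max-entropy formulation: given a prior $q$ over a finite domain $\Omega$,
% a feature map $\phi:\Omega\to\mathbb{R}^d$, and a target expectation vector $\theta$,
\begin{aligned}
\max_{p \in \Delta(\Omega)} \;\; & -\sum_{x \in \Omega} p(x)\,\log\frac{p(x)}{q(x)}
\quad \text{s.t.} \quad \sum_{x \in \Omega} p(x)\,\phi(x) = \theta, \\
\text{with solution} \quad & p_{\lambda^{\star}}(x) \propto q(x)\,e^{\langle \lambda^{\star}, \phi(x)\rangle},
\qquad
\lambda^{\star} \in \arg\min_{\lambda \in \mathbb{R}^d}
\Big( \log \sum_{x \in \Omega} q(x)\, e^{\langle \lambda, \phi(x)\rangle} - \langle \lambda, \theta\rangle \Big).
\end{aligned}
```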
Implicit Diversity in Image Summarization
Celis, L. Elisa, Keswani, Vijay
Case studies, such as Kay et al., 2015, have shown that in image summarization, such as Google Image Search, the people in the results presented for occupations are more imbalanced with respect to sensitive attributes such as gender and ethnicity than the ground truth. Most of the existing approaches to correct for this problem in image summarization assume that the images are labeled and use these labels to train the model and correct for biases. However, these labels may not always be present. Furthermore, it is often not possible (nor even desirable) to automatically classify images by sensitive attributes such as gender or race. Moreover, balancing according to the labels does not guarantee that the diversity will be visibly apparent, arguably the only metric that matters when selecting diverse images. We develop a novel approach that takes as input a visibly diverse control set of images and uses it to select images, in response to a query, that are similarly visibly diverse. We implement this approach using pre-trained and modified Convolutional Neural Networks like VGG-16, and evaluate it empirically on the image dataset compiled and used by Kay et al., 2015. We compare our results with the Google Image Search results from Kay et al., 2015 and natural baselines, and observe that our algorithm produces images that are accurate with respect to their similarity to the query images (on par with the Google Image Search results) but significantly outperforms them with respect to visible diversity, as measured by similarity to our diverse control set.
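A small sketch of control-set-guided selection follows, assuming CNN feature vectors (e.g., VGG-16 activations) have already been extracted for the candidate, query, and control images. The greedy rule that trades off query relevance against covering the visibly diverse control set is a simplification of the paper's approach, and all names and weights are illustrative.

```python
# Sketch: select query results that are relevant and collectively similar to a
# visibly diverse control set, given precomputed image embeddings.
import numpy as np

def cosine(a, b):
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T

def diverse_summary(cand_emb, query_emb, control_emb, k, alpha=0.5):
    """Pick k candidates: relevant to the query, and collectively covering the
    control set (each control image should have a close match among the picks)."""
    relevance = cosine(cand_emb, query_emb[None, :]).ravel()     # (n,)
    ctrl_sim = cosine(cand_emb, control_emb)                     # (n, m)
    chosen = []
    for _ in range(k):
        # Coverage gain: how much a candidate improves the best match to each
        # control image, summed over the control set.
        best = ctrl_sim[chosen].max(axis=0) if chosen else np.zeros(ctrl_sim.shape[1])
        gain = np.maximum(ctrl_sim - best, 0).sum(axis=1)
        score = alpha * relevance + (1 - alpha) * gain
        score[chosen] = -np.inf                                  # no repeats
        chosen.append(int(score.argmax()))
    return chosen

# Toy demo with random stand-in "embeddings".
rng = np.random.default_rng(5)
cands, query, control = rng.normal(size=(200, 64)), rng.normal(size=64), rng.normal(size=(6, 64))
print("selected indices:", diverse_summary(cands, query, control, k=10))
```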
Improved Adversarial Learning for Fair Classification
Celis, L. Elisa, Keswani, Vijay
Motivated by concerns that machine learning algorithms may introduce significant bias in classification models, developing fair classifiers has become an important problem in machine learning research. One important paradigm towards this has been providing algorithms for adversarially learning fair classifiers (Zhang et al., 2018; Madras et al., 2018). We formulate the adversarial learning problem as a multi-objective optimization problem and find the fair model using a gradient descent-ascent algorithm with a modified gradient update step, inspired by the approach of Zhang et al., 2018. We provide theoretical insight and guarantees that formalize the heuristic arguments presented previously for taking such an approach. We test our approach empirically on the Adult dataset and synthetic datasets and compare against state-of-the-art algorithms (Celis et al., 2018; Zhang et al., 2018; Zafar et al., 2017). The results show that our models and algorithms have comparable or better accuracy than other algorithms while performing better in terms of fairness, as measured using statistical rate or false discovery rate.
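For intuition, a minimal gradient descent-ascent loop is sketched below: an adversary tries to recover the protected attribute from the classifier's output, and the classifier is trained to be accurate while defeating it. This is plain alternating GDA on synthetic data; the modified gradient update step referenced in the abstract (and the projection term of Zhang et al., 2018) is not reproduced here.

```python
# Minimal alternating gradient descent-ascent sketch for adversarially fair
# classification (illustrative; not the paper's modified update step).
import torch

torch.manual_seed(0)
n, d, lam = 4000, 8, 1.0
z = (torch.rand(n) < 0.5).float()                       # protected attribute
X = torch.randn(n, d) + z[:, None] * 0.7
y = ((X[:, 0] + X[:, 1] + 0.5 * z + 0.3 * torch.randn(n)) > 0.8).float()

clf = torch.nn.Linear(d, 1)
adv = torch.nn.Linear(1, 1)                             # predicts z from the classifier score
opt_clf = torch.optim.Adam(clf.parameters(), lr=1e-2)
opt_adv = torch.optim.Adam(adv.parameters(), lr=1e-2)
bce = torch.nn.BCEWithLogitsLoss()

for step in range(2000):
    logits = clf(X).squeeze(1)

    # Ascent step: adversary learns to predict z from the (detached) classifier output.
    adv_loss = bce(adv(logits.detach().unsqueeze(1)).squeeze(1), z)
    opt_adv.zero_grad(); adv_loss.backward(); opt_adv.step()

    # Descent step: classifier minimizes its own loss minus the adversary's loss.
    clf_loss = bce(logits, y) - lam * bce(adv(logits.unsqueeze(1)).squeeze(1), z)
    opt_clf.zero_grad(); clf_loss.backward(); opt_clf.step()

with torch.no_grad():
    pred = (clf(X).squeeze(1) > 0).float()
    for g in (0.0, 1.0):
        print(f"group {int(g)} positive rate: {pred[z == g].mean().item():.3f}")
```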
Classification with Fairness Constraints: A Meta-Algorithm with Provable Guarantees
Celis, L. Elisa, Huang, Lingxiao, Keswani, Vijay, Vishnoi, Nisheeth K.
Developing classification algorithms that are fair with respect to sensitive attributes of the data has become an important problem due to the growing deployment of classification algorithms in various social contexts. Several recent works have focused on fairness with respect to a specific metric, modeled the corresponding fair classification problem as a constrained optimization problem, and developed tailored algorithms to solve it. Despite this, there still remain important metrics for which we do not have fair classifiers, and many of the aforementioned algorithms do not come with theoretical guarantees, perhaps because the resulting optimization problem is non-convex. The main contribution of this paper is a new meta-algorithm for classification that takes as input a large class of fairness constraints, with respect to multiple non-disjoint sensitive attributes, and which comes with provable guarantees. This is achieved by first developing a meta-algorithm for a large family of classification problems with convex constraints, and then showing that classification problems with general types of fairness constraints can be reduced to those in this family. We present empirical results that show that our algorithm can achieve near-perfect fairness with respect to various fairness metrics, and that the loss in accuracy due to the imposed fairness constraints is often small. Overall, this work unifies several prior works on fair classification, presents a practical algorithm with theoretical guarantees, and can handle fairness metrics that prior methods could not.
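To make the kind of constrained program concrete, the sketch below fits a logistic model subject to a bound on the gap in smoothed selection rates between two groups, using scipy's SLSQP solver. This is a generic constrained-classification illustration under assumed synthetic data, not the paper's meta-algorithm or its reduction to convex-constrained problems.

```python
# Sketch: logistic loss minimized subject to a bound on the (smoothed)
# selection-rate gap between two groups. Illustrative only.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(6)
n, d, tau = 2000, 5, 0.05
g = (rng.random(n) < 0.4).astype(float)            # protected group indicator
X = rng.normal(size=(n, d)) + g[:, None] * 0.6
y = (X[:, 0] + X[:, 1] + 0.4 * g + rng.normal(scale=0.7, size=n) > 0.8).astype(float)
Xb = np.hstack([X, np.ones((n, 1))])               # add intercept column

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def log_loss(w):
    p = sigmoid(Xb @ w)
    return -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))

def rate_gap(w):
    p = sigmoid(Xb @ w)                            # smoothed acceptance probability
    return p[g == 1].mean() - p[g == 0].mean()

# SLSQP inequality constraints require fun(w) >= 0, so bound the gap on both sides.
cons = [{"type": "ineq", "fun": lambda w: tau - rate_gap(w)},
        {"type": "ineq", "fun": lambda w: tau + rate_gap(w)}]

res = minimize(log_loss, np.zeros(d + 1), constraints=cons, method="SLSQP")
print("loss:", round(log_loss(res.x), 3), " smoothed rate gap:", round(rate_gap(res.x), 3))
```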