Agarwal, Sushant
Aggregating Data for Optimal and Private Learning
Agarwal, Sushant, Makhija, Yukti, Saket, Rishi, Raghuveer, Aravindan
In many applications, however, due to lack of instrumentation or annotators [Chen et al., 2004, Dery et al., 2017], or due to privacy constraints [Rueping, 2010], instance-wise labels may not be available. Instead, the dataset is partitioned into disjoint sets or bags of instances, and for each bag only one bag-label is available to the learner. The bag-label is derived from the undisclosed instance-labels present in the bag via some aggregation function depending on the scenario. The goal is to train a model predicting the labels of individual instances. We call this paradigm learning from aggregate labels, which directly generalizes traditional supervised learning, the latter being the special case of unit-sized bags. The two formalizations of our focus are (i) multiple instance regression (MIR), where the bag-label is one of the instance-labels of the bag and the instance whose label is chosen as the bag-label is not revealed, and (ii) learning from label proportions (LLP), in which the bag-label is the average of the bag's instance-labels. In both MIR and LLP, our work considers real-valued instance-labels, with regression as the underlying instance-level task.
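To make the two aggregation functions concrete, here is a minimal sketch (not taken from the paper) of how bag-labels arise from undisclosed instance-labels. It assumes numpy; the bag size, the synthetic labels, and the uniformly random choice of the revealed label under MIR are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 100 instances with real-valued labels, partitioned
# into 20 disjoint bags of 5; the learner sees only the bag-labels.
instance_labels = rng.normal(size=100)      # undisclosed instance-labels
bags = instance_labels.reshape(20, 5)       # disjoint bags of instances

# LLP: the bag-label is the average of the bag's instance-labels.
llp_bag_labels = bags.mean(axis=1)

# MIR: the bag-label is one of the bag's instance-labels, and which
# instance was chosen is not revealed (modeled here as a uniform pick).
mir_bag_labels = bags[np.arange(20), rng.integers(0, 5, size=20)]
```

With unit-sized bags both aggregations reduce to the identity, recovering ordinary supervised regression.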
On the Power of Randomization in Fair Classification and Representation
Agarwal, Sushant, Deshpande, Amit
Fair classification and fair representation learning are two important problems in supervised and unsupervised fair machine learning, respectively. Fair classification asks for a classifier that maximizes accuracy on a given data distribution subject to fairness constraints. Fair representation learning maps a given data distribution over the original feature space to a distribution over a new representation space such that all classifiers over the representation satisfy fairness. In this paper, we examine the power of randomization in both these problems to minimize the loss of accuracy that results when we impose fairness constraints. Previous work on fair classification has characterized the optimal fair classifiers on a given data distribution that maximize accuracy subject to fairness constraints, e.g., Demographic Parity (DP), Equal Opportunity (EO), and Predictive Equality (PE). We refine these characterizations to demonstrate when the optimal randomized fair classifiers can surpass their deterministic counterparts in accuracy. We also show how the optimal randomized fair classifier that we characterize can be obtained as a solution to a convex optimization problem. Recent work has provided techniques to construct fair representations for a given data distribution such that any classifier over this representation satisfies DP. However, the classifiers on these fair representations come with either no accuracy guarantees or only weak ones when compared to the optimal fair classifier on the original data distribution. Extending our ideas for randomized fair classification, we improve on these works and construct DP-fair, EO-fair, and PE-fair representations that have provably optimal accuracy and suffer no accuracy loss compared to the optimal DP-fair, EO-fair, and PE-fair classifiers, respectively, on the original data distribution.
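To illustrate the claim that the optimal randomized fair classifier solves a convex program, the sketch below encodes a toy Demographic Parity instance as a linear program. This is not the paper's construction: the finite feature space, the made-up probabilities, and the use of scipy.optimize.linprog are assumptions for illustration only.

```python
import numpy as np
from scipy.optimize import linprog

# Toy instance: feature x in {0, 1, 2}, protected group a in {0, 1}.
# mass[a, x] = P(X = x, A = a); eta[a, x] = P(Y = 1 | X = x, A = a).
mass = np.array([[0.2, 0.2, 0.1],
                 [0.1, 0.2, 0.2]])
eta = np.array([[0.9, 0.6, 0.2],
                [0.8, 0.4, 0.1]])

# Decision variables p[a, x] in [0, 1]: probability of predicting 1 on (x, a).
# Accuracy = const + sum(mass * (2*eta - 1) * p), so minimize its negative.
c = -(mass * (2 * eta - 1)).ravel()

# Demographic Parity: P(Yhat = 1 | A = 0) = P(Yhat = 1 | A = 1).
cond = mass / mass.sum(axis=1, keepdims=True)     # P(X = x | A = a)
A_eq = np.concatenate([cond[0], -cond[1]])[None, :]

res = linprog(c, A_eq=A_eq, b_eq=[0.0], bounds=[(0, 1)] * 6)
p_opt = res.x.reshape(2, 3)    # optimal randomized DP-fair classifier
```

A deterministic classifier corresponds to p taking values in {0, 1}; whenever the LP optimum is strictly fractional, randomization strictly beats every deterministic DP-fair classifier on the instance.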
Private Mean Estimation with Person-Level Differential Privacy
Agarwal, Sushant, Kamath, Gautam, Majid, Mahbod, Mouzakis, Argyris, Silver, Rose, Ullman, Jonathan
We study differentially private (DP) mean estimation in the case where each person holds multiple samples. Commonly referred to as the "user-level" setting, DP here requires the usual notion of distributional stability when all of a person's datapoints can be modified. Informally, if $n$ people each have $m$ samples from an unknown $d$-dimensional distribution with bounded $k$-th moments, we show that \[n = \tilde{\Theta}\left(\frac{d}{\alpha^2 m} + \frac{d}{\alpha m^{1/2} \varepsilon} + \frac{d}{\alpha^{k/(k-1)} m \varepsilon} + \frac{d}{\varepsilon}\right)\] people are necessary and sufficient to estimate the mean up to distance $\alpha$ in $\ell_2$-norm under $\varepsilon$-differential privacy (and its common relaxations). In the multivariate setting, we give computationally efficient algorithms under approximate DP (with slightly degraded sample complexity) and computationally inefficient algorithms under pure DP, and our nearly matching lower bounds hold for the most permissive case of approximate DP. Our computationally efficient estimators are based on the well-known noisy-clipped-mean approach, but the analysis for our setting requires new bounds on the tails of sums of independent, vector-valued, bounded-moments random variables, and a new argument for bounding the bias introduced by clipping.
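The following is a minimal sketch of the noisy-clipped-mean idea referenced above, instantiated with the Gaussian mechanism under approximate DP. It is not the paper's algorithm: clipping around the origin, the fixed clip radius, and the noise calibration below are simplifying assumptions.

```python
import numpy as np

def clipped_noisy_mean(user_data, clip_radius, eps, delta, rng):
    """Person-level DP mean estimate: average each person's samples,
    clip the per-person means to an l2-ball of radius clip_radius
    (centered at the origin, a simplification), average, and add
    Gaussian noise calibrated to the resulting sensitivity."""
    n, m, d = user_data.shape
    per_person = user_data.mean(axis=1)                        # (n, d)
    norms = np.linalg.norm(per_person, axis=1, keepdims=True)
    clipped = per_person * np.minimum(1.0, clip_radius / np.maximum(norms, 1e-12))
    # Replacing one person's data moves the average by at most 2*clip_radius/n.
    sigma = (2 * clip_radius / n) * np.sqrt(2 * np.log(1.25 / delta)) / eps
    return clipped.mean(axis=0) + rng.normal(0.0, sigma, size=d)

rng = np.random.default_rng(1)
data = rng.normal(loc=0.5, size=(200, 10, 3))   # n=200 people, m=10 samples, d=3
estimate = clipped_noisy_mean(data, clip_radius=2.0, eps=1.0, delta=1e-5, rng=rng)
```

Clipping keeps the sensitivity, and hence the added noise, small; the price is the clipping bias, which is what the abstract's new argument is needed to control.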
Impossibility results for fair representations
Lechner, Tosca, Ben-David, Shai, Agarwal, Sushant, Ananthakrishnan, Nivasini
With the growing awareness of fairness in machine learning and the realization of the central role that data representation plays in data processing tasks, there is an obvious interest in notions of fair data representations. The goal of such representations is that a model trained on data under the representation (e.g., a classifier) will be guaranteed to respect some fairness constraints. Such representations are useful when they can be fixed for training models on various different tasks, and also when they serve as data filtering between the raw data (known to the representation designer) and potentially malicious agents that use the data under the representation to learn predictive models and make decisions. A long list of recent research papers strives to provide tools for achieving these goals. However, we prove that this is basically a futile effort. Roughly stated, we prove that no representation can guarantee the fairness of classifiers for different tasks trained using it; even the basic goal of achieving label-independent Demographic Parity fairness fails once the marginal data distribution shifts. More refined notions of fairness, like Odds Equality, cannot be guaranteed by a representation that does not take into account the task-specific labeling rule with respect to which such fairness will be evaluated (even if the marginal data distribution is known a priori). Furthermore, except for trivial cases, no representation can guarantee Odds Equality fairness for any two different tasks while allowing accurate label predictions for both. While some of our conclusions are intuitive, we formulate (and prove) crisp statements of such impossibilities, often contrasting impressions conveyed by many recent works on fair representations.
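A tiny numerical sketch of the Demographic Parity claim above: under one marginal distribution every classifier over a fixed representation is DP-fair, but after a marginal shift the same representation admits a DP-violating classifier. The feature space, the representation f, the classifier h, and all probabilities are invented for illustration.

```python
import numpy as np

# Feature x in {0, 1, 2}, group a in {0, 1}. A representation is a map
# z = f(x); downstream classifiers see only z.
f = np.array([0, 1, 1])    # representation collapsing x = 1 and x = 2
h = np.array([0, 1])       # classifier on z: predict 1 iff z = 1

def dp_gap(px_given_a):
    """|P(h(Z)=1 | A=0) - P(h(Z)=1 | A=1)| for group-conditional marginals."""
    rates = [sum(p for x, p in enumerate(group) if h[f[x]] == 1)
             for group in px_given_a]
    return abs(rates[0] - rates[1])

# Original marginal: both groups put mass 0.5 on {x : f(x) = 1}, so h
# (indeed, any classifier over z) satisfies Demographic Parity exactly.
print(dp_gap([[0.5, 0.3, 0.2], [0.5, 0.2, 0.3]]))   # 0.0

# Shifted marginal: the groups now put different mass on {x : f(x) = 1},
# and the very same representation no longer guarantees DP for h.
print(dp_gap([[0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]))   # 0.3
```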