AITopics | Balsubramani, Akshay

Collaborating Authors

Balsubramani, Akshay

Laws of thermodynamics for exponential families

Balsubramani, Akshay

arXiv.org Artificial Intelligence, Jan-3-2025

Most learning problems can be solved by minimization of log loss. This bare fact is inescapable in modern AI and machine learning - the variety is in the details. What is the space of measured data? What is the support of the distribution? Changing such properties of the problem fundamentally changes learning behavior, leading to the variety of modeling approaches successfully used in data science. But for many inference and decision-making tasks, log loss can be axiomatically inescapable. We explore such loss minimization problems in the language of statistical mechanics, which studies how systems of "particles" like atoms can be approximately described by relatively few bulk properties. There is a direct analogue to modeling, where large datasets are described by relatively few model parameters.

artificial intelligence, exponential family, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2501.02071

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

Entropy, concentration, and learning: a statistical mechanics primer

Balsubramani, Akshay

arXiv.org Machine Learning, Sep-27-2024

Artificial intelligence models trained through loss minimization have demonstrated significant success, grounded in principles from fields like information theory and statistical physics. This work explores these established connections through the lens of statistical mechanics, starting from first-principles sample concentration behaviors that underpin AI and machine learning. Our development of statistical mechanics for modeling highlights the key role of exponential families, and quantities of statistics, physics, and information theory.

artificial intelligence, machine learning, statistical mechanics, (17 more...)

arXiv.org Machine Learning

2409.18630

Country:

Europe > United Kingdom > England (0.14)
North America > United States (0.14)
Europe > France (0.14)

Genre:

Research Report (0.63)
Instructional Material (0.45)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
(2 more...)

p-value peeking and estimating extrema

Balsubramani, Akshay

arXiv.org Machine Learning, Nov-2-2020

A pervasive issue in statistical hypothesis testing is that the reported $p$-values are biased downward by data "peeking" -- the practice of reporting only progressively extreme values of the test statistic as more data samples are collected. We develop principled mechanisms to estimate such running extrema of test statistics, which directly address the effect of peeking in some general scenarios.
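The downward bias from peeking can be illustrated with a small simulation. This is a minimal sketch and not the paper's estimation mechanism: it assumes a stream of standard normal samples under a true null hypothesis, a z-test recomputed after every sample, and hypothetical names (`two_sided_p`, `simulate`) chosen for illustration. It compares the false-positive rate of testing once at the final sample size against reporting the smallest p-value seen along the way.

```python
import math
import random


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def simulate(n_trials=2000, n_samples=200, alpha=0.05, seed=0):
    """Return (fixed-n rejection rate, peeking rejection rate) under the null."""
    rng = random.Random(seed)
    fixed_rejections = 0
    peeking_rejections = 0
    for _ in range(n_trials):
        total = 0.0
        min_p = 1.0
        p = 1.0
        for n in range(1, n_samples + 1):
            total += rng.gauss(0.0, 1.0)      # null is true: mean 0, variance 1
            z = total / math.sqrt(n)           # z-statistic after n samples
            p = two_sided_p(z)
            min_p = min(min_p, p)              # running extremum of the p-value
        fixed_rejections += int(p < alpha)         # test once, at the end
        peeking_rejections += int(min_p < alpha)   # report the best value seen
    return fixed_rejections / n_trials, peeking_rejections / n_trials


fixed_rate, peeking_rate = simulate()
print(f"fixed-n false-positive rate: {fixed_rate:.3f}")
print(f"peeking false-positive rate: {peeking_rate:.3f}")
```

Because the running minimum can never exceed the final p-value, the peeking rate is always at least the fixed-n rate; in this setup it should come out well above the nominal level `alpha`.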

artificial intelligence, health & medicine, martingale, (16 more...)

arXiv.org Machine Learning

2011.01343

Country: North America > United States (0.68)

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine (0.68)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)

Learning transport cost from subset correspondence

Liu, Ruishan, Balsubramani, Akshay, Zou, James

arXiv.org Machine Learning, Sep-29-2019

Learning to align multiple datasets is an important problem with many applications, and it is especially useful when we need to integrate multiple experiments or correct for confounding. Optimal transport (OT) is a principled approach to aligning datasets, but a key challenge in applying OT is that we need to specify a transport cost function that accurately captures how the two datasets are related. Reliable cost functions are typically not available, and practitioners often resort to a hand-crafted or Euclidean cost even if it may not be appropriate. In this work, we investigate how to learn the cost function using a small amount of side information, which is often available. The side information we consider captures subset correspondence, i.e., certain subsets of points in the two datasets are known to be related. For example, we may have some images labeled as cars in both datasets, or we may have a common annotated cell type in single-cell data from two batches. We develop an end-to-end optimizer (OT-SI) that differentiates through the Sinkhorn algorithm and effectively learns a suitable cost function from side information. In systematic experiments on images, marriage matching, and single-cell RNA-seq, our method substantially outperforms state-of-the-art benchmarks.

dataset, neural network, optimization problem, (19 more...)

arXiv.org Machine Learning

1909.13203

Genre: Research Report (0.82)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

An adaptive nearest neighbor rule for classification

Balsubramani, Akshay, Dasgupta, Sanjoy, Freund, Yoav, Moran, Shay

arXiv.org Artificial Intelligence, May-29-2019

We introduce a variant of the $k$-nearest neighbor classifier in which $k$ is chosen adaptively for each query, rather than supplied as a parameter. The choice of $k$ depends on properties of each neighborhood, and therefore may significantly vary between different points. (For example, the algorithm will use larger $k$ for predicting the labels of points in noisy regions.) We provide theory and experiments that demonstrate that the algorithm performs comparably to, and sometimes better than, $k$-NN with an optimal choice of $k$. In particular, we derive bounds on the convergence rates of our classifier that depend on a local quantity we call the "advantage", which is significantly weaker than the Lipschitz conditions used in previous convergence rate proofs. These generalization bounds hinge on a variant of the seminal Uniform Convergence Theorem due to Vapnik and Chervonenkis; this variant concerns conditional probabilities and may be of independent interest.

artificial intelligence, convergence, machine learning, (18 more...)

arXiv.org Artificial Intelligence

1905.12717

Country: North America > United States (0.46)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Semantically Decomposing the Latent Spaces of Generative Adversarial Networks

Donahue, Chris, Lipton, Zachary C., Balsubramani, Akshay, McAuley, Julian

arXiv.org Artificial Intelligence, Oct-31-2017

We propose a new algorithm for training generative adversarial networks that jointly learns latent codes for both identities (e.g. individual humans) and observations (e.g. specific photographs). By fixing the identity portion of the latent codes, we can generate diverse images of the same subject, and by fixing the observation portion, we can traverse the manifold of subjects while maintaining contingent aspects such as lighting and pose. Our algorithm features a pairwise training scheme in which each sample from the generator consists of two images with a common identity code. Corresponding samples from the real dataset consist of two distinct photographs of the same subject. In order to fool the discriminator, the generator must produce pairs that are photorealistic, distinct, and appear to depict the same individual. We augment both the DCGAN and BEGAN approaches with Siamese discriminators to facilitate pairwise training. Experiments with human judges and an off-the-shelf face verification system demonstrate our algorithm's ability to generate convincing, identity-matched photographs.

artificial intelligence, discriminator, neural network, (18 more...)

arXiv.org Artificial Intelligence

1705.07904

Country: North America > United States > California (0.28)

Industry: Information Technology > Security & Privacy (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Linking Generative Adversarial Learning and Binary Classification

Balsubramani, Akshay

arXiv.org Machine Learning, Sep-5-2017

In this note, we point out a basic link between generative adversarial (GA) training and binary classification -- any powerful discriminator essentially computes an (f-)divergence between real and generated samples. The result, repeatedly re-derived in decision theory, has implications for GA Networks (GANs), providing an alternative perspective on training f-GANs by designing the discriminator loss function.
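A familiar special case of this link, taken from the standard analysis of the original GAN objective rather than from this note, is the log-loss discriminator. Assuming $p$ denotes the density of real samples and $q$ the density of generated samples (notation introduced here for illustration), the best achievable discriminator objective is the Jensen-Shannon divergence up to scaling and a constant:

```latex
\max_{D}\; \mathbb{E}_{x \sim p}\bigl[\log D(x)\bigr]
         + \mathbb{E}_{x \sim q}\bigl[\log\bigl(1 - D(x)\bigr)\bigr]
  \;=\; 2\,\mathrm{JS}(p \,\|\, q) - \log 4,
\qquad
D^{*}(x) = \frac{p(x)}{p(x) + q(x)}
```

Here $D^{*}$ is the Bayes-optimal binary classifier between real and generated samples, so training a sufficiently powerful discriminator with log loss amounts to estimating this particular divergence.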

artificial intelligence, divergence, reinforcement learning, (15 more...)

arXiv.org Machine Learning

1709.01509

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.41)

Optimal Binary Classifier Aggregation for General Losses

Balsubramani, Akshay, Freund, Yoav S.

Neural Information Processing Systems, Dec-31-2016

We address the problem of aggregating an ensemble of predictors with known loss bounds in a semi-supervised binary classification setting, to minimize prediction loss incurred on the unlabeled data. We find the minimax optimal predictions for a very general class of loss functions including all convex and many non-convex losses, extending a recent analysis of the problem for misclassification error. The result is a family of semi-supervised ensemble aggregation algorithms which are as efficient as linear learning by convex optimization, but are minimax optimal without any relaxations. Their decision rules take a form familiar in decision theory -- applying sigmoid functions to a notion of ensemble margin -- without the assumptions typically made in margin-based learning.

artificial intelligence, machine learning, prediction, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > California (0.14)
Europe > Spain (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.36)

Learning to Abstain from Binary Prediction

Balsubramani, Akshay

arXiv.org Machine Learning, Nov-29-2016

A binary classifier capable of abstaining from making a label prediction has two goals in tension: minimizing errors, and avoiding abstaining unnecessarily often. In this work, we exactly characterize the best achievable tradeoff between these two goals in a general semi-supervised setting, given an ensemble of predictors of varying competence as well as unlabeled data on which we wish to predict or abstain. We give an algorithm for learning a classifier in this setting which trades off its errors against abstentions in a minimax optimal manner, is as efficient as linear learning and prediction, and is demonstrably practical. Our analysis extends to a large class of loss functions and other scenarios, including ensembles composed of specialists that can themselves abstain.

classifier, neural network, optimization problem, (18 more...)

arXiv.org Machine Learning

1602.08151

Country: North America > United States > California (0.14)

Genre: Research Report (0.50)

Technology: