AITopics | Bhargava, Aniruddha

Collaborating Authors

Bhargava, Aniruddha

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Off-Policy Evaluation from Logged Human Feedback

Bhargava, Aniruddha, Jain, Lalit, Kveton, Branislav, Liu, Ge, Mukherjee, Subhojyoti

arXiv.org Machine LearningJun-14-2024

Learning from human feedback has been central to recent advances in artificial intelligence and machine learning. Since the collection of human feedback is costly, a natural question to ask is if the new feedback always needs to collected. Or could we evaluate a new model with the human feedback on responses of another model? This motivates us to study off-policy evaluation from logged human feedback. We formalize the problem, propose both model-based and model-free estimators for policy values, and show how to optimize them. We analyze unbiasedness of our estimators and evaluate them empirically. Our estimators can predict the absolute values of evaluated policies, rank them, and be optimized.

data mining, large language model, machine learning, (22 more...)

arXiv.org Machine Learning

2406.1003

Country:

North America > United States > Wisconsin (0.14)
North America > United States > Illinois (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.48)
Information Technology > Data Science > Data Mining > Big Data (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

Add feedback

Pessimistic Off-Policy Multi-Objective Optimization

Alizadeh, Shima, Bhargava, Aniruddha, Gopalswamy, Karthick, Jain, Lalit, Kveton, Branislav, Liu, Ge

arXiv.org Machine LearningOct-28-2023

Multi-objective optimization is a type of decision making problems where multiple conflicting objectives are optimized. We study offline optimization of multi-objective policies from data collected by an existing policy. We propose a pessimistic estimator for the multi-objective policy values that can be easily plugged into existing formulas for hypervolume computation and optimized. The estimator is based on inverse propensity scores (IPS), and improves upon a naive IPS estimator in both theory and experiments. Our analysis is general, and applies beyond our IPS estimators and methods for optimizing them. The pessimistic estimator can be optimized by policy gradients and performs well in all of our experiments.

artificial intelligence, machine learning, optimization, (18 more...)

arXiv.org Machine Learning

2310.18617

Country:

Europe > Netherlands (0.14)
Europe > Germany (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Neural Reconstruction with Approximate Message Passing (NeuRAMP)

Fletcher, Alyson K., Rangan, Sundeep, Varshney, Lav R., Bhargava, Aniruddha

Neural Information Processing SystemsFeb-14-2020, 23:59:23 GMT

Many functional descriptions of spiking neurons assume a cascade structure where inputs are passed through an initial linear filtering stage that produces a low-dimensional signal that drives subsequent nonlinear stages. This paper presents a novel and systematic parameter estimation procedure for such models and applies the method to two neural estimation problems: (i) compressed-sensing based neural mapping from multi-neuron excitation, and (ii) estimation of neural receptive yields in sensory neurons. The proposed estimation algorithm models the neurons via a graphical model and then estimates the parameters in the model using a recently-developed generalized approximate message passing (GAMP) method. The GAMP method is based on Gaussian approximations of loopy belief propagation. In the neural connectivity problem, the GAMP-based method is shown to be computational efficient, provides a more exact modeling of the sparsity, can incorporate nonlinearities in the output and significantly outperforms previous compressed-sensing methods.

artificial intelligence, machine learning, message passing, (6 more...)

Neural Information Processing Systems

Technology:

Information Technology > Architecture > Distributed Systems (0.65)
Information Technology > Artificial Intelligence > Machine Learning (0.48)

Add feedback

Linear Bandits with Feature Feedback

Oswal, Urvashi, Bhargava, Aniruddha, Nowak, Robert

arXiv.org Machine LearningMar-11-2019

This paper explores a new form of the linear bandit problem in which the algorithm receives the usual stochastic rewards as well as stochastic feedback about which features are relevant to the rewards, the latter feedback being the novel aspect. The focus of this paper is the development of new theory and algorithms for linear bandits with feature feedback. We show that linear bandits with feature feedback can achieve regret over time horizon $T$ that scales like $k\sqrt{T}$, without prior knowledge of which features are relevant nor the number $k$ of relevant features. In comparison, the regret of traditional linear bandits is $d\sqrt{T}$, where $d$ is the total number of (relevant and irrelevant) features, so the improvement can be dramatic if $k\ll d$. The computational complexity of the new algorithm is proportional to $k$ rather than $d$, making it much more suitable for real-world applications compared to traditional linear bandits. We demonstrate the performance of the new algorithm with synthetic and real human-labeled data.

algorithm, artificial intelligence, big data, (21 more...)

arXiv.org Machine Learning

1903.03705

Country: North America > United States > Wisconsin (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Data Science > Data Mining > Big Data (0.35)

Add feedback

Scalable Generalized Linear Bandits: Online Computation and Hashing

Jun, Kwang-Sung, Bhargava, Aniruddha, Nowak, Robert, Willett, Rebecca

Neural Information Processing SystemsDec-31-2017

Generalized Linear Bandits (GLBs), a natural extension of the stochastic linear bandits, has been popular and successful in recent years. However, existing GLBs scale poorly with the number of rounds and the number of arms, limiting their utility in practice. This paper proposes new, scalable solutions to the GLB problem in two respects. First, unlike existing GLBs, whose per-time-step space and time complexity grow at least linearly with time $t$, we propose a new algorithm that performs online computations to enjoy a constant space and time complexity. At its heart is a novel Generalized Linear extension of the Online-to-confidence-set Conversion (GLOC method) that takes \emph{any} online learning algorithm and turns it into a GLB algorithm. As a special case, we apply GLOC to the online Newton step algorithm, which results in a low-regret GLB algorithm with much lower time and memory complexity than prior work. Second, for the case where the number $N$ of arms is very large, we propose new algorithms in which each next arm is selected via an inner product search. Such methods can be implemented via hashing algorithms (i.e., ``hash-amenable'') and result in a time complexity sublinear in $N$. While a Thompson sampling extension of GLOC is hash-amenable, its regret bound for $d$-dimensional arm sets scales with $d^{3/2}$, whereas GLOC's regret bound scales with $d$. Towards closing this gap, we propose a new hash-amenable algorithm whose regret bound scales with $d^{5/4}$. Finally, we propose a fast approximate hash-key computation (inner product) with a better accuracy than the state-of-the-art, which can be of independent interest. We conclude the paper with preliminary experimental results confirming the merits of our methods.

algorithm, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.14)

Genre: Research Report > New Finding (0.48)

Industry: Education (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Scalable Generalized Linear Bandits: Online Computation and Hashing

Jun, Kwang-Sung, Bhargava, Aniruddha, Nowak, Robert, Willett, Rebecca

arXiv.org Machine LearningOct-21-2017

Generalized Linear Bandits (GLBs), a natural extension of the stochastic linear bandits, has been popular and successful in recent years. However, existing GLBs scale poorly with the number of rounds and the number of arms, limiting their utility in practice. This paper proposes new, scalable solutions to the GLB problem in two respects. First, unlike existing GLBs, whose per-time-step space and time complexity grow at least linearly with time $t$, we propose a new algorithm that performs online computations to enjoy a constant space and time complexity. At its heart is a novel Generalized Linear extension of the Online-to-confidence-set Conversion (GLOC method) that takes \emph{any} online learning algorithm and turns it into a GLB algorithm. As a special case, we apply GLOC to the online Newton step algorithm, which results in a low-regret GLB algorithm with much lower time and memory complexity than prior work. Second, for the case where the number $N$ of arms is very large, we propose new algorithms in which each next arm is selected via an inner product search. Such methods can be implemented via hashing algorithms (i.e., "hash-amenable") and result in a time complexity sublinear in $N$. While a Thompson sampling extension of GLOC is hash-amenable, its regret bound for $d$-dimensional arm sets scales with $d^{3/2}$, whereas GLOC's regret bound scales with $d$. Towards closing this gap, we propose a new hash-amenable algorithm whose regret bound scales with $d^{5/4}$. Finally, we propose a fast approximate hash-key computation (inner product) with a better accuracy than the state-of-the-art, which can be of independent interest. We conclude the paper with preliminary experimental results confirming the merits of our methods.

algorithm, artificial intelligence, upstream oil & gas, (19 more...)

arXiv.org Machine Learning

1706.00136

Genre: Research Report > New Finding (0.48)

Industry:

Energy > Oil & Gas > Upstream (0.46)
Education (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Active Algorithms For Preference Learning Problems with Multiple Populations

Bhargava, Aniruddha, Ganti, Ravi, Nowak, Robert

arXiv.org Machine LearningJun-22-2016

In this paper we model the problem of learning preferences of a population as an active learning problem. We propose an algorithm can adaptively choose pairs of items to show to users coming from a heterogeneous population, and use the obtained reward to decide which pair of items to show next. We provide computationally efficient algorithms with provable sample complexity guarantees for this problem in both the noiseless and noisy cases. In the process of establishing sample complexity guarantees for our algorithms, we establish new results using a Nystr{\"o}m-like method which can be of independent interest. We supplement our theoretical results with experimental comparisons.

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

1603.04118

Country: North America > United States > Wisconsin > Dane County > Madison (0.14)

Genre: Research Report > New Finding (0.48)

Industry: Education > Focused Education > Special Education (0.61)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Robust Spatio-Temporal Signal Recovery from Noisy Counts in Social Media

Xu, Jun-Ming, Bhargava, Aniruddha, Nowak, Robert, Zhu, Xiaojin

arXiv.org Artificial IntelligenceApr-10-2012

Many real-world phenomena can be represented by a spatio-temporal signal: where, when, and how much. Social media is a tantalizing data source for those who wish to monitor such signals. Unlike most prior work, we assume that the target phenomenon is known and we are given a method to count its occurrences in social media. However, counting is plagued by sample bias, incomplete data, and, paradoxically, data scarcity -- issues inadequately addressed by prior work. We formulate signal recovery as a Poisson point process estimation problem. We explicitly incorporate human population bias, time delays and spatial distortions, and spatio-temporal regularization into the model to address the noisy count issues. We present an efficient optimization algorithm and discuss its theoretical properties. We show that our model is more accurate than commonly-used baselines. Finally, we present a case study on wildlife roadkill monitoring, where our model produces qualitatively convincing results.

optimization problem, social media, tweet, (20 more...)

arXiv.org Artificial Intelligence

1204.2248

Country: