AITopics | Kamalaruban, Parameswaran

Collaborating Authors

Kamalaruban, Parameswaran

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Robust Inverse Reinforcement Learning under Transition Dynamics Mismatch

Viano, Luca, Huang, Yu-Ting, Kamalaruban, Parameswaran, Cevher, Volkan

arXiv.org Machine LearningJul-2-2020

We study the inverse reinforcement learning (IRL) problem under the \emph{transition dynamics mismatch} between the expert and the learner. In particular, we consider the Maximum Causal Entropy (MCE) IRL learner model and provide an upper bound on the learner's performance degradation based on the $\ell_1$-distance between the two transition dynamics of the expert and the learner. Then, by leveraging insights from the Robust RL literature, we propose a robust MCE IRL algorithm, which is a principled approach to help with this mismatch issue. Finally, we empirically demonstrate the stable performance of our algorithm compared to the standard MCE IRL algorithm under transition mismatches in finite MDP problems.

artificial intelligence, reinforcement learning, total return total return, (14 more...)

arXiv.org Machine Learning

2007.01174

Country: North America > United States (0.45)

Genre: Research Report (0.64)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Add feedback

Interaction-limited Inverse Reinforcement Learning

Troussard, Martin, Pignat, Emmanuel, Kamalaruban, Parameswaran, Calinon, Sylvain, Cevher, Volkan

arXiv.org Machine LearningJul-1-2020

This paper proposes an inverse reinforcement learning (IRL) framework to accelerate learning when the learner-teacher \textit{interaction} is \textit{limited} during training. Our setting is motivated by the realistic scenarios where a helpful teacher is not available or when the teacher cannot access the learning dynamics of the student. We present two different training strategies: Curriculum Inverse Reinforcement Learning (CIRL) covering the teacher's perspective, and Self-Paced Inverse Reinforcement Learning (SPIRL) focusing on the learner's perspective. Using experiments in simulations and experiments with a real robot learning a task from a human demonstrator, we show that our training strategies can allow a faster training than a random teacher for CIRL and than a batch learner for SPIRL.

artificial intelligence, demonstration, survey article, (15 more...)

arXiv.org Machine Learning

2007.00425

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.68)

Industry: Education (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Environment Shaping in Reinforcement Learning using State Abstraction

Kamalaruban, Parameswaran, Devidze, Rati, Cevher, Volkan, Singla, Adish

arXiv.org Machine LearningJun-23-2020

One of the central challenges faced by a reinforcement learning (RL) agent is to effectively learn a (near-)optimal policy in environments with large state spaces having sparse and noisy feedback signals. In real-world applications, an expert with additional domain knowledge can help in speeding up the learning process via \emph{shaping the environment}, i.e., making the environment more learner-friendly. A popular paradigm in literature is \emph{potential-based reward shaping}, where the environment's reward function is augmented with additional local rewards using a potential function. However, the applicability of potential-based reward shaping is limited in settings where (i) the state space is very large, and it is challenging to compute an appropriate potential function, (ii) the feedback signals are noisy, and even with shaped rewards the agent could be trapped in local optima, and (iii) changing the rewards alone is not sufficient, and effective shaping requires changing the dynamics. We address these limitations of potential-based shaping methods and propose a novel framework of \emph{environment shaping using state abstraction}. Our key idea is to compress the environment's large state space with noisy signals to an abstracted space, and to use this abstraction in creating smoother and more effective feedback signals for the agent. We study the theoretical underpinnings of our abstraction-based environment shaping, and show that the agent's policy learnt in the shaped environment preserves near-optimal behavior in the original environment.

agent, artificial intelligence, reinforcement learning, (17 more...)

arXiv.org Machine Learning

2006.1316

Genre: Research Report (0.40)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Interactive Teaching Algorithms for Inverse Reinforcement Learning

Kamalaruban, Parameswaran, Devidze, Rati, Cevher, Volkan, Singla, Adish

arXiv.org Artificial IntelligenceJun-5-2019

We study the problem of inverse reinforcement learning (IRL) with the added twist that the learner is assisted by a helpful teacher. More formally, we tackle the following algorithmic question: How could a teacher provide an informative sequence of demonstrations to an IRL learner to speed up the learning process? We present an interactive teaching framework where a teacher adaptively chooses the next demonstration based on learner's current policy. In particular, we design teaching algorithms for two concrete settings: an omniscient setting where a teacher has full knowledge about the learner's dynamics and a blackbox setting where the teacher has minimal knowledge. Then, we study a sequential variant of the popular MCE-IRL learner and prove convergence guarantees of our teaching algorithm in the omniscient setting. Extensive experiments with a car driving simulator environment show that the learning progress can be speeded up drastically as compared to an uninformative teacher.

artificial intelligence, learner, survey article, (21 more...)

arXiv.org Artificial Intelligence

1905.11867

Genre: Research Report (0.82)

Industry: Education > Educational Setting > Online (0.61)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Iterative Classroom Teaching

Yeo, Teresa, Kamalaruban, Parameswaran, Singla, Adish, Merchant, Arpit, Asselborn, Thibault, Faucon, Louis, Dillenbourg, Pierre, Cevher, Volkan

arXiv.org Machine LearningNov-12-2018

We consider the machine teaching problem in a classroom-like setting wherein the teacher has to deliver the same examples to a diverse group of students. Their diversity stems from differences in their initial internal states as well as their learning rates. We prove that a teacher with full knowledge about the learning dynamics of the students can teach a target concept to the entire classroom using O(min{d,N} log(1/eps)) examples, where d is the ambient dimension of the problem, N is the number of learners, and eps is the accuracy parameter. We show the robustness of our teaching strategy when the teacher has limited knowledge of the learners' internal dynamics as provided by a noisy oracle. Further, we study the trade-off between the learners' workload and the teacher's cost in teaching the target concept. Our experiments validate our theoretical results and suggest that appropriately partitioning the classroom into homogenous groups provides a balance between these two objectives.

artificial intelligence, educational setting, learner, (18 more...)

arXiv.org Machine Learning

1811.03537

Genre:

Research Report (0.82)
Instructional Material (0.68)

Industry: Education > Educational Setting (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Add feedback

$d_{\mathcal{X}}$-Private Mechanisms for Linear Queries

Kamalaruban, Parameswaran, Perrier, Victor, Asghar, Hassan Jameel, Kaafar, Mohamed Ali

arXiv.org Machine LearningJun-6-2018

Differential Privacy is one of the strongest privacy guarantees, which allows the release of useful information about any sensitive dataset. However, it provides the same level of protection for all elements in the data universe. In this paper, we consider $d_{\mathcal{X}}$-privacy, an instantiation of the privacy notion introduced in \cite{chatzikokolakis2013broadening}, which allows specifying a separate privacy budget for each pair of elements in the data universe. We describe a systematic procedure to tailor any existing differentially private mechanism into a $d_{\mathcal{X}}$-private variant for the case of linear queries. For the resulting $d_{\mathcal{X}}$-private mechanisms, we provide theoretical guarantees on the trade-off between utility and privacy, and show that they always outperform their \emph{vanilla} counterpart. We demonstrate the effectiveness of our procedure, by evaluating the proposed $d_{\mathcal{X}}$-private Laplace mechanism on both synthetic and real datasets using a set of randomly generated linear queries.

artificial intelligence, machine learning, mechanism, (17 more...)

arXiv.org Machine Learning

1806.02389

Country: North America > United States (0.28)

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Minimax Lower Bounds for Cost Sensitive Classification

Kamalaruban, Parameswaran, Williamson, Robert C.

arXiv.org Machine LearningMay-20-2018

The cost-sensitive classification problem plays a crucial role in mission-critical machine learning applications, and differs with traditional classification by taking the misclassification costs into consideration. Although being studied extensively in the literature, the fundamental limits of this problem are still not well understood. We investigate the hardness of this problem by extending the standard minimax lower bound of balanced binary classification problem (due to \cite{massart2006risk}), and emphasize the impact of cost terms on the hardness.

artificial intelligence, bayesian inference, classification problem, (18 more...)

arXiv.org Machine Learning

1805.07723

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
(2 more...)

Add feedback

Exp-Concavity of Proper Composite Losses

Kamalaruban, Parameswaran, Williamson, Robert C., Zhang, Xinhua

arXiv.org Machine LearningMay-20-2018

The goal of online prediction with expert advice is to find a decision strategy which will perform almost as well as the best expert in a given pool of experts, on any sequence of outcomes. This problem has been widely studied and $O(\sqrt{T})$ and $O(\log{T})$ regret bounds can be achieved for convex losses (\cite{zinkevich2003online}) and strictly convex losses with bounded first and second derivatives (\cite{hazan2007logarithmic}) respectively. In special cases like the Aggregating Algorithm (\cite{vovk1995game}) with mixable losses and the Weighted Average Algorithm (\cite{kivinen1999averaging}) with exp-concave losses, it is possible to achieve $O(1)$ regret bounds. \cite{van2012exp} has argued that mixability and exp-concavity are roughly equivalent under certain conditions. Thus by understanding the underlying relationship between these two notions we can gain the best of both algorithms (strong theoretical performance guarantees of the Aggregating Algorithm and the computational efficiency of the Weighted Average Algorithm). In this paper we provide a complete characterization of the exp-concavity of any proper composite loss. Using this characterization and the mixability condition of proper losses (\cite{van2012mixability}), we show that it is possible to transform (re-parameterize) any $\beta$-mixable binary proper loss into a $\beta$-exp-concave composite loss with the same $\beta$. In the multi-class case, we propose an approximation approach for this transformation.

artificial intelligence, composite loss, machine learning, (18 more...)

arXiv.org Machine Learning

1805.07737

Country: Oceania > Australia (0.28)

Genre: Research Report (0.81)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Consistent Robust Regression

Bhatia, Kush, Jain, Prateek, Kamalaruban, Parameswaran, Kar, Purushottam

Neural Information Processing SystemsDec-31-2017

We present the first efficient and provably consistent estimator for the robust regression problem. The area of robust learning and optimization has generated a significant amount of interest in the learning and statistics communities in recent years owing to its applicability in scenarios with corrupted data, as well as in handling model mis-specifications. In particular, special interest has been devoted to the fundamental problem of robust linear regression where estimators that can tolerate corruption in up to a constant fraction of the response variables are widely studied. Surprisingly however, to this date, we are not aware of a polynomial time estimator that offers a consistent estimate in the presence of dense, unbounded corruptions. In this work we present such an estimator, called CRR. This solves an open problem put forward in the work of (Bhatia et al, 2015). Our consistency analysis requires a novel two-stage proof technique involving a careful analysis of the stability of ordered lists which may be of independent interest. We show that CRR not only offers consistent estimates, but is empirically far superior to several other recently proposed algorithms for the robust regression problem, including extended Lasso and the TORRENT algorithm. In comparison, CRR offers comparable or better model recovery but with runtimes that are faster by an order of magnitude.

algorithm, artificial intelligence, machine learning, (14 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.28)

Technology:

Information Technology > Data Science (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

Add feedback

Efficient and Consistent Robust Time Series Analysis

Bhatia, Kush, Jain, Prateek, Kamalaruban, Parameswaran, Kar, Purushottam

arXiv.org Machine LearningJul-1-2016

We study the problem of robust time series analysis under the standard auto-regressive (AR) time series model in the presence of arbitrary outliers. We devise an efficient hard thresholding based algorithm which can obtain a consistent estimate of the optimal AR model despite a large fraction of the time series points being corrupted. Our algorithm alternately estimates the corrupted set of points and the model parameters, and is inspired by recent advances in robust regression and hard-thresholding methods. However, a direct application of existing techniques is hindered by a critical difference in the time-series domain: each point is correlated with all previous points rendering existing tools inapplicable directly. We show how to overcome this hurdle using novel proof techniques. Using our techniques, we are also able to provide the first efficient and provably consistent estimator for the robust regression problem where a standard linear observation model with white additive noise is corrupted arbitrarily. We illustrate our methods on synthetic datasets and show that our methods indeed are able to consistently recover the optimal parameters despite a large fraction of points being corrupted.

artificial intelligence, machine learning, probability, (18 more...)

arXiv.org Machine Learning

1607.00146

Country: Asia > India (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Time Series Analysis (0.60)

Add feedback