
Collaborating Authors: Bharti, Shubham Kumar


In-Context Learning with Hypothesis-Class Guidance

arXiv.org Artificial Intelligence

Recent research has investigated the underlying mechanisms of in-context learning (ICL) both theoretically and empirically, often using data generated from simple function classes. However, existing work often focuses on sequences consisting solely of labeled examples, whereas in practice labeled examples are typically accompanied by an instruction that provides side information about the task. In this work, we propose ICL with hypothesis-class guidance (ICL-HCG), a novel synthetic data model for ICL in which the input context consists of a literal description of a (finite) hypothesis class H together with $(x,y)$ pairs labeled by a hypothesis chosen from H. Under the ICL-HCG framework, we conduct extensive experiments to explore: (i) a variety of generalization abilities to new hypothesis classes; (ii) different model architectures; (iii) sample complexity; (iv) in-context data imbalance; (v) the role of instructions; and (vi) the effect of pretraining hypothesis diversity. We show that (a) Transformers can successfully learn ICL-HCG and generalize to unseen hypotheses and unseen hypothesis classes, and (b) compared with ICL without instructions, ICL-HCG achieves significantly higher accuracy, demonstrating the benefit of instructions.
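
To make the data model concrete, below is a minimal sketch of how an ICL-HCG-style context might be assembled, assuming boolean inputs and hypotheses represented as full truth tables; the function name, the "[H]"/"[D]" markers, and all parameters are illustrative placeholders rather than the paper's exact tokenization.

```python
import random

def make_icl_hcg_context(num_hypotheses=4, n_bits=3, num_pairs=5, seed=0):
    """Build one ICL-HCG-style sequence: an instruction that literally
    describes a finite hypothesis class H, followed by (x, y) pairs
    labeled by one hypothesis drawn from H.
    (Illustrative sketch; the paper's exact format may differ.)"""
    rng = random.Random(seed)
    inputs = [format(i, f"0{n_bits}b") for i in range(2 ** n_bits)]
    # Each hypothesis is a full truth table over the 2^n_bits inputs.
    H = [[rng.randint(0, 1) for _ in inputs] for _ in range(num_hypotheses)]
    target = rng.randrange(num_hypotheses)  # hypothesis the model must infer
    instruction = " | ".join(
        f"h{j}:" + "".join(map(str, h)) for j, h in enumerate(H)
    )
    pairs = []
    for _ in range(num_pairs):
        i = rng.randrange(len(inputs))
        pairs.append((inputs[i], H[target][i]))
    examples = " ".join(f"{x}->{y}" for x, y in pairs)
    return f"[H] {instruction} [D] {examples}", target

context, target = make_icl_hcg_context()
print(context)  # e.g. "[H] h0:01011010 | ... [D] 011->1 100->0 ..."
```

A model would then be pretrained to predict each label from the preceding context, so that at test time it can combine the hypothesis-class description with a few labeled pairs to identify the target hypothesis.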


Optimally Teaching a Linear Behavior Cloning Agent

arXiv.org Artificial Intelligence

We study optimal teaching of Linear Behavior Cloning (LBC) learners. In this setup, the teacher selects which states to demonstrate to an LBC learner. The learner maintains a version space of infinitely many linear hypotheses consistent with the demonstrations. The teacher's goal is to teach a realizable target policy to the learner using the minimum number of state demonstrations; this number is known as the Teaching Dimension (TD). We present a teaching algorithm called ``Teach using Iterative Elimination (TIE)'' that achieves instance-optimal TD. However, we also show that finding the optimal teaching set is computationally NP-hard. We further provide an approximation algorithm that guarantees an approximation ratio of $\log(|A|-1)$ on the teaching dimension. Finally, we provide experimental results that validate the efficiency and effectiveness of our algorithm.
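
The $\log(|A|-1)$ guarantee suggests a greedy, set-cover-style selection of demonstration states. Below is a generic greedy sketch of that flavor, assuming we can compute, for each candidate state, the set of constraints (e.g., competing action choices) that its demonstration eliminates from the version space; it illustrates the approximation idea only and is not the TIE algorithm itself.

```python
def greedy_teaching_set(candidates, universe):
    """Greedy set-cover sketch: 'universe' is the set of constraints that
    must be eliminated from the version space, and candidates[s] is the
    set of constraints eliminated by demonstrating state s. Greedy
    selection gives the usual logarithmic approximation to the smallest
    teaching set. (Illustrative; not the paper's TIE algorithm.)"""
    uncovered = set(universe)
    teaching_set = []
    while uncovered:
        # Pick the state whose demonstration eliminates the most
        # still-uncovered constraints.
        best = max(candidates, key=lambda s: len(candidates[s] & uncovered))
        if not candidates[best] & uncovered:
            raise ValueError("remaining constraints cannot be eliminated")
        teaching_set.append(best)
        uncovered -= candidates[best]
    return teaching_set

# Toy example: states s0..s2, constraints c0..c3 to eliminate.
candidates = {"s0": {"c0", "c1"}, "s1": {"c1", "c2"}, "s2": {"c2", "c3"}}
print(greedy_teaching_set(candidates, {"c0", "c1", "c2", "c3"}))  # e.g. ['s0', 's2']
```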


Provable Defense against Backdoor Policies in Reinforcement Learning

arXiv.org Artificial Intelligence

We propose a provable defense mechanism against backdoor policies in reinforcement learning under a subspace trigger assumption. A backdoor policy is a security threat in which an adversary publishes a seemingly well-behaved policy that in fact contains hidden triggers. During deployment, the adversary can modify observed states in a particular way to trigger unexpected actions and harm the agent. We assume the agent does not have the resources to re-train a good policy. Instead, our defense mechanism sanitizes the backdoor policy by projecting observed states onto a 'safe subspace', estimated from a small number of interactions with a clean (non-triggered) environment. Our sanitized policy achieves $\epsilon$-approximate optimality in the presence of triggers, provided the number of clean interactions is $O\left(\frac{D}{(1-\gamma)^4 \epsilon^2}\right)$, where $\gamma$ is the discount factor and $D$ is the dimension of the state space. Empirically, we show that our sanitization defense performs well on two Atari game environments.
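
As a rough illustration of the sanitization idea, the sketch below estimates a safe subspace from clean states with a truncated SVD and projects every observed state onto it before passing it to the (possibly backdoored) policy; the helper names, the rank choice, and the toy policy are assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np

def estimate_safe_subspace(clean_states, rank):
    """Estimate a 'safe subspace' from states collected in a clean
    (non-triggered) environment via truncated SVD.
    Returns an orthonormal basis of shape (d, rank)."""
    X = np.asarray(clean_states, dtype=float)   # shape (n, d)
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return vt[:rank].T                          # columns span the subspace

def sanitize(policy, basis):
    """Wrap a (possibly backdoored) policy so that every observed state is
    projected onto the safe subspace before the policy sees it."""
    proj = basis @ basis.T                      # d x d projection matrix
    return lambda state: policy(proj @ np.asarray(state, dtype=float))

# Toy usage with an illustrative stand-in policy.
clean = np.random.randn(1000, 8) @ np.random.randn(8, 8)
basis = estimate_safe_subspace(clean, rank=4)
backdoored_policy = lambda s: int(s.sum() > 0)  # placeholder policy
safe_policy = sanitize(backdoored_policy, basis)
print(safe_policy(np.random.randn(8)))
```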


The Teaching Dimension of Q-learning

arXiv.org Artificial Intelligence

In this paper, we initiate the study of the sample complexity of teaching, termed the "teaching dimension" (TDim) in the literature, for Q-learning. While the teaching dimension of supervised learning has been studied extensively, those results do not extend to reinforcement learning due to the temporal constraints posed by the underlying Markov Decision Process environment. We characterize the TDim of Q-learning under different teachers with varying control over the environment, and present matching optimal teaching algorithms. Our TDim results provide the minimum number of samples needed for reinforcement learning, thus complementing standard PAC-style RL sample-complexity analysis. Our teaching algorithms have the potential to speed up RL agent learning in applications where a helpful teacher is available.
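
For intuition about the learner being taught, here is a minimal tabular Q-learning sketch driven by a teacher-chosen sequence of $(s, a, r, s')$ experiences; the teaching-dimension question is how few such experiences suffice to make the learner's greedy policy match the target. The learner model and parameters are illustrative assumptions, not the paper's optimal teaching algorithms.

```python
import numpy as np

def teach_q_learner(transitions, n_states, n_actions, alpha=0.9, gamma=0.9):
    """Run a tabular Q-learner on a teacher-chosen sequence of
    (s, a, r, s') experiences and return the resulting greedy policy.
    In the teaching setting, the teacher picks these tuples (subject to
    how much control it has over the environment) so that the greedy
    policy matches the target after as few samples as possible.
    (Illustrative learner model only.)"""
    Q = np.zeros((n_states, n_actions))
    for s, a, r, s_next in transitions:
        td_target = r + gamma * Q[s_next].max()
        Q[s, a] += alpha * (td_target - Q[s, a])
    return Q.argmax(axis=1)  # greedy policy induced by the taught Q-table

# Toy usage: two states, two actions, teacher promotes action 1 in state 0.
demo = [(0, 1, 1.0, 1), (0, 0, 0.0, 1), (1, 0, 1.0, 0)]
print(teach_q_learner(demo, n_states=2, n_actions=2))
```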


On the relationship between multitask neural networks and multitask Gaussian Processes

arXiv.org Machine Learning

Multitask learning (MTL) is a learning paradigm in which multiple tasks are learned jointly, aiming to improve the performance of individual tasks by sharing information across tasks through various mechanisms [4, 26]. For example, MTL models based on deep neural networks commonly use shared hidden layers for all the tasks; probabilistic MTL models are usually based on shared priors over the parameters of the multiple tasks [16, 5]; and Gaussian Process based models, e.g., multitask Gaussian Processes (GPs) and their extensions [2, 23], commonly employ covariance functions that model both input and task similarity. Multi-label, multi-class, and multi-output learning can be seen as special cases of multitask learning in which each task has the same set of inputs. Transfer learning is also similar to MTL, except that the objective of MTL is to improve the performance over all the tasks, whereas the objective of transfer learning is usually to improve the performance of a target task by leveraging information from source tasks [26]. Zero-shot learning and few-shot learning are also closely related to MTL. Prior works [14, 24] have shown that a fully connected Bayesian neural network (NN) [13, 15] with a single, infinitely wide hidden layer and independent and identically distributed (i.i.d.) priors on the weights is equivalent to a Gaussian Process. This result has recently been generalized to deep Bayesian neural networks [9] with any number of hidden layers. These connections between Bayesian neural networks and GPs offer many benefits, such as a theoretical understanding of neural networks and efficient Bayesian inference for deep NNs by learning the equivalent GP. Motivated by the equivalence of deep Bayesian neural networks and GPs, in this work we investigate whether a similar connection exists between deep multitask Bayesian neural networks [18] and multitask Gaussian Processes.
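
For reference, the sketch below builds the standard intrinsic-coregionalization covariance of a multitask GP, in which an inter-task similarity matrix multiplies an input kernel; it is a generic illustration of the kind of covariance function mentioned above, with illustrative names and parameters, not a construction taken from the paper.

```python
import numpy as np

def rbf(x, x2, lengthscale=1.0):
    """Standard RBF kernel over inputs."""
    d = np.linalg.norm(np.asarray(x) - np.asarray(x2))
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def multitask_cov(points, task_similarity, lengthscale=1.0):
    """Covariance matrix of a multitask GP in intrinsic-coregionalization
    form: Cov[f_t(x), f_t'(x')] = B[t, t'] * k(x, x'). 'points' is a list
    of (x, task_index) pairs and 'task_similarity' (B) is a PSD matrix of
    inter-task covariances. (Illustrative sketch of the standard
    construction.)"""
    n = len(points)
    K = np.zeros((n, n))
    for i, (xi, ti) in enumerate(points):
        for j, (xj, tj) in enumerate(points):
            K[i, j] = task_similarity[ti, tj] * rbf(xi, xj, lengthscale)
    return K

# Toy usage: two tasks sharing a 1-D input space.
B = np.array([[1.0, 0.7], [0.7, 1.0]])  # inter-task covariance
pts = [([0.0], 0), ([0.5], 0), ([0.5], 1), ([1.0], 1)]
print(multitask_cov(pts, B).round(3))
```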