Akshay Krishnamurthy
Model Selection for Contextual Bandits
Dylan J. Foster, Akshay Krishnamurthy, Haipeng Luo
Contextual bandits with surrogate losses: Margin bounds and efficient algorithms
Dylan J. Foster, Akshay Krishnamurthy
We use surrogate losses to obtain several new regret bounds and new algorithms for contextual bandit learning. Using the ramp loss, we derive new margin-based regret bounds in terms of standard sequential complexity measures of a benchmark class of real-valued regression functions. Using the hinge loss, we derive an efficient algorithm with a $\sqrt{dT}$-type mistake bound against benchmark policies induced by d-dimensional regressors. Under realizability assumptions, our results also yield classical regret bounds.
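For reference, the two surrogate losses mentioned above have standard textbook forms (a sketch; the paper's margin scaling and normalization may differ). For a real-valued score z,

\[
\phi_{\mathrm{ramp}}(z) = \min\bigl(1, \max(0,\, 1 - z)\bigr),
\qquad
\phi_{\mathrm{hinge}}(z) = \max(0,\, 1 - z).
\]

The ramp loss is a bounded, margin-sensitive proxy for the 0-1 loss, while the hinge loss is its convex relaxation, which is the standard route to computationally efficient algorithms.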
In this section, we provide a detailed proof of the main theorem. We first state some facts about the learning rate and the algorithm. The bound consists of three parts: the first is an upper bound for the first step, when there is no data, and the third is an "average" of the estimated future regret.
We study reinforcement learning in continuous state and action spaces endowed with a metric. We provide a refined analysis of the algorithm of Sinclair, Banerjee, and Yu (2019) and show that its regret scales with the zooming dimension of the instance. This parameter, which originates in the bandit literature, captures the size of the subsets of near optimal actions and is always smaller than the covering dimension used in previous analyses. As such, our results are the first provably adaptive guarantees for reinforcement learning in metric spaces.
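For context, a common form of the zooming dimension from the bandit literature (a sketch with illustrative constants; the paper's exact definition may differ) is

\[
d_{\mathrm{zoom}} = \inf\Bigl\{ d \ge 0 \;:\; \forall r \in (0,1],\ \ \mathcal{N}_{r/2}\bigl(\{x : \Delta(x) \le r\}\bigr) \le c\, r^{-d} \Bigr\},
\]

where \Delta(x) is the suboptimality gap of the point x, \mathcal{N}_\varepsilon(S) is the \varepsilon-covering number of the set S, and c is a constant. Because only near-optimal points must be covered, the zooming dimension never exceeds the covering dimension of the whole space, and it can be much smaller when the set of near-optimal actions is small.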
Sample Complexity of Learning Mixture of Sparse Linear Regressions
Akshay Krishnamurthy, Arya Mazumdar, Andrew McGregor, Soumyabrata Pal
In the problem of learning mixtures of linear regressions, the goal is to learn a collection of signal vectors from a sequence of (possibly noisy) linear measurements, where each measurement is evaluated on an unknown signal drawn uniformly from this collection. This setting is quite expressive and has been studied both in terms of practical applications and for the sake of establishing theoretical guarantees. In this paper, we consider the case where the signal vectors are sparse; this generalizes the popular compressed sensing paradigm. We improve upon the state-of-the-art results as follows: In the noisy case, we resolve an open question of Yin et al. (IEEE Transactions on Information Theory, 2019) by showing how to handle collections of more than two vectors and present the first robust reconstruction algorithm, i.e., if the signals are not perfectly sparse, we still learn a good sparse approximation of the signals. In the noiseless case, as well as in the noisy case, we show how to circumvent the need for a restrictive assumption required in the previous work. Our techniques are quite different from those in the previous work: for the noiseless case, we rely on a property of sparse polynomials and for the noisy case, we provide new connections to learning Gaussian mixtures and use ideas from the theory of error correcting codes.
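To make the measurement model concrete, here is a minimal simulation of the setting described above. The function and parameter names are illustrative, not from the paper; the learner observes only the query vectors and responses, not the hidden mixture labels or the signals.

import numpy as np

def sample_measurements(signals, n_measurements, noise_std=0.1, rng=None):
    """Draw linear measurements from a mixture of sparse linear regressions.

    Each measurement uses a fresh Gaussian query vector x and an unknown
    signal beta drawn uniformly at random from `signals`; the response is
    y = <x, beta> + Gaussian noise.
    """
    rng = np.random.default_rng(rng)
    signals = np.asarray(signals)                 # shape (L, d): L sparse signal vectors
    L, d = signals.shape
    X = rng.standard_normal((n_measurements, d))  # query vectors
    labels = rng.integers(L, size=n_measurements) # hidden mixture component per query
    y = np.einsum("ij,ij->i", X, signals[labels]) + noise_std * rng.standard_normal(n_measurements)
    return X, y  # the learner sees (X, y) but not `labels` or `signals`

# Example: two 3-sparse signals in dimension 50.
d, k, L = 50, 3, 2
rng = np.random.default_rng(0)
signals = np.zeros((L, d))
for ell in range(L):
    support = rng.choice(d, size=k, replace=False)
    signals[ell, support] = rng.standard_normal(k)
X, y = sample_measurements(signals, n_measurements=200, rng=1)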