Verma, Nakul, Branson, Kristin

Metric learning seeks a transformation of the feature space that enhances prediction qualityfor a given task. In this work we provide PACstyle sample complexity rates for supervised metric learning. We give matching lower-and upper-bounds showing that sample complexity scales with the representation dimension when no assumptions are made about the underlying data distribution. In addition, by leveraging the structure of the data distribution, we provide rates fine-tuned to a specific notion of the intrinsic complexity of a given dataset, allowing us to relax the dependence on representation dimension. We show both theoretically and empirically thataugmenting the metric learning optimization criterion with a simple norm-based regularization is important and can help adapt to a dataset's intrinsic complexityyielding better generalization, thus partly explaining the empirical success of similar regularizations reported in previous works.

The multi-armed bandit (MAB) setting is a useful abstraction of many online learning tasks which focuses on the trade-off between exploration and exploitation. In this setting, an online algorithm has a fixed set of alternatives ("arms"), and in each round it selects one arm and then observes the corresponding reward. While the case of small number of arms is by now well-understood, a lot of recent work has focused on multi-armed bandits with (infinitely) many arms, where one needs to assume extra structure in order to make the problem tractable. In particular, in the Lipschitz MAB problem there is an underlying similarity metric space, known to the algorithm, such that any two arms that are close in this metric space have similar payoffs. In this paper we consider the more realistic scenario in which the metric space is *implicit* -- it is defined by the available structure but not revealed to the algorithm directly.

Verma, Nakul, Branson, Kristin

Metric learning seeks a transformation of the feature space that enhances prediction quality for the given task at hand. In this work we provide PAC-style sample complexity rates for supervised metric learning. We give matching lower- and upper-bounds showing that the sample complexity scales with the representation dimension when no assumptions are made about the underlying data distribution. However, by leveraging the structure of the data distribution, we show that one can achieve rates that are fine-tuned to a specific notion of intrinsic complexity for a given dataset. Our analysis reveals that augmenting the metric learning optimization criterion with a simple norm-based regularization can help adapt to a dataset's intrinsic complexity, yielding better generalization. Experiments on benchmark datasets validate our analysis and show that regularizing the metric can help discern the signal even when the data contains high amounts of noise.

Wanigasekara, Nirandika, Yu, Christina

Suppose that there is a large set of arms, yet there is a simple but unknown structure amongst the arm reward functions, e.g. We present a novel algorithm which learns data-driven similarities amongst the arms, in order to implement adaptive partitioning of the context-arm space for more efficient learning. We provide regret bounds along with simulations that highlight the algorithm's dependence on the local geometry of the reward functions. Papers published at the Neural Information Processing Systems Conference.

Combes, Richard, Shahi, Mohammad Sadegh Talebi Mazraeh, Proutiere, Alexandre, lelarge, marc

This paper investigates stochastic and adversarial combinatorial multi-armed bandit problems. In the stochastic setting under semi-bandit feedback, we derive a problem-specific regret lower bound, and discuss its scaling with the dimension of the decision space. We propose ESCB, an algorithm that efficiently exploits the structure of the problem and provide a finite-time analysis of its regret. ESCB has better performance guarantees than existing algorithms, and significantly outperforms these algorithms in practice. In the adversarial setting under bandit feedback, we propose CombEXP, an algorithm with the same regret scaling as state-of-the-art algorithms, but with lower computational complexity for some combinatorial problems.