Goto

Collaborating Authors

 unknown metric


Nonparametric Contextual Bandits in Metric Spaces with Unknown Metric

Neural Information Processing Systems

Consider a nonparametric contextual multi-arm bandit problem where each arm $a \in [K]$ is associated to a nonparametric reward function $f_a: [0,1] \to \mathbb{R}$ mapping from contexts to the expected reward. Suppose that there is a large set of arms, yet there is a simple but unknown structure amongst the arm reward functions, e.g.


Reviews: Nonparametric Contextual Bandits in Metric Spaces with Unknown Metric

Neural Information Processing Systems

The paper is clear, and makes efforts to highlight the behavior of the proposed algorithm (value of the regret bound for some specific settings, experiments measuring the impact of the metric). The comparison to other settings may still be enforced. In the experimental part, I would also appreciate the comparison to include some state of the art algorithms. What would be the empirical results of a gaussian process-based bandit? It would also be interesting to have results on datasets used by other contextual/similarity-based bandits (except that these datasets use a context in R d). Finally, it's surprising to have a context space of dimension 1. Extending the algorithm to R d setting seems strait-forward.


Reviews: Nonparametric Contextual Bandits in Metric Spaces with Unknown Metric

Neural Information Processing Systems

This paper proposed a nonparametric approach to contextual bandits that can adapt to unknown simple structure between the arms. The reviewers found the paper to be novel in how it combined existing ideas to solve an interesting problem. Additionally, the reviewers found the paper to be clearly written and of potential significance for the problem they tacked. Some minor concerns were raised, however, the authors seem to have addressed them in their response. The authors should incorporate any suggested edits by the reviewers as well as the promised updates in the author response.


Nonparametric Contextual Bandits in Metric Spaces with Unknown Metric

Neural Information Processing Systems

Consider a nonparametric contextual multi-arm bandit problem where each arm a \in [K] is associated to a nonparametric reward function f_a: [0,1] \to \mathbb{R} mapping from contexts to the expected reward. Suppose that there is a large set of arms, yet there is a simple but unknown structure amongst the arm reward functions, e.g. We present a novel algorithm which learns data-driven similarities amongst the arms, in order to implement adaptive partitioning of the context-arm space for more efficient learning. We provide regret bounds along with simulations that highlight the algorithm's dependence on the local geometry of the reward functions.


Nonparametric Contextual Bandits in Metric Spaces with Unknown Metric

Wanigasekara, Nirandika, Yu, Christina

Neural Information Processing Systems

Consider a nonparametric contextual multi-arm bandit problem where each arm $a \in [K]$ is associated to a nonparametric reward function $f_a: [0,1] \to \mathbb{R}$ mapping from contexts to the expected reward. Suppose that there is a large set of arms, yet there is a simple but unknown structure amongst the arm reward functions, e.g. We present a novel algorithm which learns data-driven similarities amongst the arms, in order to implement adaptive partitioning of the context-arm space for more efficient learning. We provide regret bounds along with simulations that highlight the algorithm's dependence on the local geometry of the reward functions. Papers published at the Neural Information Processing Systems Conference.