Nonparametric Contextual Bandits in Metric Spaces with Unknown Metric
Neural Information Processing Systems
Consider a nonparametric contextual multi-armed bandit problem where each arm $a \in [K]$ is associated with a nonparametric reward function $f_a: [0,1] \to \mathbb{R}$ that maps each context to its expected reward. Suppose that there is a large set of arms, yet there is a simple but unknown structure amongst the arm reward functions, e.g.
Dec-25-2025, 21:01:07 GMT