An Exploration-by-Optimization Approach to Best of Both Worlds in Linear Bandits
Neural Information Processing Systems
In this paper, we consider how to construct best-of-both-worlds linear bandit algorithms that achieve nearly optimal performance in both stochastic and adversarial environments. To this end, we show that a natural approach known as exploration by optimization [Lattimore and Szepesvári, 2020b] works well.
Apr-30-2026, 02:09:01 GMT