Thresholding Bandit with Optimal Aggregate Regret
Neural Information Processing Systems
We introduce LSA, a new, simple, anytime algorithm that aims to minimize the aggregate regret, i.e., the expected number of misclassified arms. We prove that the algorithm is instance-wise asymptotically optimal.
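To make the objective concrete, the following is a minimal sketch of how the aggregate regret could be estimated by simulation. It does not implement LSA, whose details are not given here. It uses a plain round-robin sampling baseline, and it assumes Bernoulli rewards and the convention that an arm with mean at or above the threshold counts as "above". All function names are hypothetical.

```python
import random


def misclassified_count(means, threshold, predicted_above):
    """Count arms whose predicted side of the threshold is wrong.

    Assumed convention: an arm is 'above' when its true mean >= threshold.
    """
    return sum((mu >= threshold) != pred
               for mu, pred in zip(means, predicted_above))


def uniform_sampling_predictions(means, threshold, budget, rng):
    """Baseline, not LSA: pull arms round-robin, then threshold empirical means.

    Rewards are assumed to be Bernoulli with the given means.
    """
    k = len(means)
    pulls = [0] * k
    totals = [0.0] * k
    for t in range(budget):
        arm = t % k
        reward = 1.0 if rng.random() < means[arm] else 0.0
        pulls[arm] += 1
        totals[arm] += reward
    # An arm that was never pulled is predicted 'below' by default.
    return [pulls[i] > 0 and totals[i] / pulls[i] >= threshold
            for i in range(k)]


def estimate_aggregate_regret(means, threshold, budget, trials=2000, seed=0):
    """Monte Carlo estimate of the expected number of misclassified arms."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        preds = uniform_sampling_predictions(means, threshold, budget, rng)
        total += misclassified_count(means, threshold, preds)
    return total / trials
```

Calling `estimate_aggregate_regret([0.3, 0.45, 0.7], 0.5, budget=60)` returns a value between 0 and the number of arms. Arms whose means lie close to the threshold contribute most of the error under this baseline.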
Nov-17-2025, 22:46:56 GMT