On Regret with Multiple Best Arms
Neural Information Processing Systems
We study regret minimization in the multi-armed bandit setting when multiple best or near-optimal arms exist.
Oct-3-2025, 03:01:02 GMT