Regret Bounds for Information-Directed Reinforcement Learning
Neural Information Processing Systems
As a result, IDS demonstrates impressive empirical performance [Russo and Van Roy, 2018] and outperforms UCB and TS in terms of asymptotic optimality [Kirschner et al.,