Regret Bounds for Information-Directed Reinforcement Learning

Neural Information Processing Systems 

As a result, IDS demonstrates impressive empirical performance [Russo and Van Roy, 2018] and outperforms UCB and TS in terms of asymptotic optimality [Kirschner et al.,
