Near-Optimal Randomized Exploration for Tabular Markov Decision Processes

Neural Information Processing Systems 

We study algorithms using randomized value functions for exploration in reinforcement learning. This type of algorithms enjoys appealing empirical performance.

Similar Docs  Excel Report  more

TitleSimilaritySource
None found