On Optimistic versus Randomized Exploration in Reinforcement Learning
