Why is Posterior Sampling Better than Optimism for Reinforcement Learning?
