Optimistic Posterior Sampling for Reinforcement Learning with Few Samples and Tight Guarantees
