A Provably Efficient Model-Free Posterior Sampling Method for Episodic Reinforcement Learning
