Posterior Sampling for Reinforcement Learning Without Episodes
