Thompson Sampling for Infinite-Horizon Discounted Decision Processes