Efficient Online Learning with Offline Datasets for Infinite Horizon MDPs: A Bayesian Approach

Open in new window