The Plug-in Approach for Average-Reward and Discounted MDPs: Optimal Sample Complexity Analysis
