The Plug-in Approach for Average-Reward and Discounted MDPs: Optimal Sample Complexity Analysis