A Tale of Sampling and Estimation in Discounted Reinforcement Learning