A Reinforcement Learning Approach to Estimating Long-term Treatment Effects