Evaluating Reinforcement Learning Algorithms in Observational Health Settings