Estimating the Maximum Expected Value in Continuous Reinforcement Learning Problems