Continuous Doubly Constrained Batch Reinforcement Learning