Appendix: Continuous Doubly Constrained Batch Reinforcement Learning

A Experiment Details

Evaluation Procedure