Convergent Policy Optimization for Safe Reinforcement Learning

Ming Yu, Zhuoran Yang, Mladen Kolar, Zhaoran Wang

Neural Information Processing Systems 
