p |S|3|A|K) 0 OptPess-PrimalDual O(H

Neural Information Processing Systems 

We address the issue of safety in reinforcement learning. We pose the problem in an episodic framework of a constrained Markov decision process.