Safe Exploration in Finite Markov Decision Processes with Gaussian Processes

Matteo Turchetta, Felix Berkenkamp, Andreas Krause

Neural Information Processing Systems 

MDP, for this task and prove that it completely explores the safely reachable part of the MDP without violating the safety constraint.