Reviews: Safe Model-based Reinforcement Learning with Stability Guarantees

Neural Information Processing Systems 

My understanding of the paper: This paper describes a novel algorithm for safe model-based control of an unknown system. This is an important problem space, and I am happy to see new contributions. The proposed approach uses a learnt model of the system and constrains the policy to avoid actions that could bring the system to unsafe states. Both policy actions and exploratory actions are subject to this safety constraint. The proposed algorithm rests on reasonable assumptions: Lipschitz continuity of the system dynamics and the existence of a Lyapunov function, which provides some quantification of the risk-related cost of a state.
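
To make this reading concrete, here is a minimal sketch of the kind of check such an approach might perform: a state is certified only if the Lyapunov function decreases under the policy by more than a Lipschitz margin that accounts for the grid resolution. Everything in it is an illustrative assumption rather than the paper's implementation: the toy linear `model`, the `policy`, the quadratic `lyapunov` candidate, and the constants `lipschitz` and `tau` are all hypothetical, and model uncertainty is ignored.

```python
import numpy as np


def lyapunov(x):
    # Hypothetical Lyapunov candidate v(x) = x^2 (assumption, not from the paper).
    return x ** 2


def model(x, u):
    # Hypothetical stand-in for the learnt model: toy linear dynamics.
    return 1.1 * x + 0.5 * u


def policy(x):
    # Hypothetical linear feedback policy.
    return -1.0 * x


def decrease_certified(states, lipschitz, tau):
    """Return a boolean mask of grid states that pass the decrease test.

    A state x passes if v(f(x, pi(x))) - v(x) < -lipschitz * tau, where
    `lipschitz` bounds how fast that difference can change between grid
    points and `tau` is the largest distance from any state to the grid.
    """
    next_states = model(states, policy(states))
    decrease = lyapunov(next_states) - lyapunov(states)
    return decrease < -lipschitz * tau


# Grid on [-1, 1] with spacing 0.1, so every state is within 0.05 of a grid point.
states = np.linspace(-1.0, 1.0, 21)

# For this toy system the difference is -0.64 * x^2, whose slope on [-1, 1]
# is at most 1.28 in magnitude, so 1.28 is a valid Lipschitz constant here.
mask = decrease_certified(states, lipschitz=1.28, tau=0.05)

print("certified states:", states[mask])
```

Note that states close to the origin fail the test in this sketch, because the guaranteed decrease there is smaller than the discretization margin. A finer grid (smaller `tau`) shrinks that uncertified neighbourhood.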