Safe Reinforcement Learning via Projection on a Safe Set: How to Achieve Optimality?