Safe Reinforcement Learning with Dead-Ends Avoidance and Recovery