A Provable Approach for End-to-End Safe Reinforcement Learning