Enhancing Efficiency of Safe Reinforcement Learning via Sample Manipulation

Neural Information Processing Systems 

Safe reinforcement learning (RL) often suffers from sample inefficiency, requiring extensive interactions with the environment to learn a safe policy.