Enhancing Safety in Reinforcement Learning with Human Feedback via Rectified Policy Optimization