Privacy-Preserving Reinforcement Learning from Human Feedback via Decoupled Reward Modeling