Governance Challenges in Reinforcement Learning from Human Feedback: Evaluator Rationality and Reinforcement Stability