Strategyproof Reinforcement Learning from Human Feedback