Provable Multi-Party Reinforcement Learning with Diverse Human Feedback