R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning
