IFDECORATOR: Wrapping Instruction Following Reinforcement Learning with Verifiable Rewards
