Two Minds Better Than One: Collaborative Reward Modeling for LLM Alignment
