RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style
