RMB: Comprehensively Benchmarking Reward Models in LLM Alignment