ALaRM: Align Language Models via Hierarchical Rewards Modeling