MetaRM: Shifted Distributions Alignment via Meta-Learning