The Trickle-down Impact of Reward (In-)consistency on RLHF