Bias Fitting to Mitigate Length Bias of Reward Model in RLHF