Theoretical Analysis of KL-regularized RLHF with Multiple Reference Models