Intelligently Weighting Multiple Reference Models for Direct Preference Optimization of LLMs