FuseChat-3.0: Preference Optimization Meets Heterogeneous Model Fusion

Ziyi Yang, Fanqi Wan, Longguang Zhong, Canbin Huang, Guosheng Liang, Xiaojun Quan

arXiv.org Artificial Intelligence 

We introduce FuseChat-3.0, a suite of large language models (LLMs) developed by integrating the strengths of heterogeneous source LLMs into more compact target LLMs. Our source models include the powerful Gemma-2-27B-it, Mistral-Large-Instruct-2407, Qwen-2.5-72B-Instruct, and Llama-3.1-70B-Instruct. To leverage the diverse capabilities of these source models, we develop a specialized data construction protocol tailored to various tasks and domains. The FuseChat-3.0 training pipeline consists of two key stages: (1) supervised fine-tuning (SFT) to align the target and source model distributions, and (2) Direct Preference Optimization (DPO), which fine-tunes the target model on preferences derived from multiple source LLMs. As illustrated in Figure 1, using Llama-3.1-8B-Instruct as the target model, our fusion approach achieves an average improvement of 6.8 points across 14 benchmarks. Moreover, it yields remarkable gains of 37.1 and 30.1 points on the instruction-following benchmarks AlpacaEval-2 and Arena-Hard, respectively.

Combining the strengths of multiple LLMs provides a powerful means to enhance performance, robustness, and generalization across diverse tasks by leveraging the unique expertise and knowledge each model offers. Individual LLMs, particularly those constrained by size or training data, may perform well in specific areas but struggle in others due to specialization gaps. For instance, one model might excel at generating creative content yet lack precision in technical explanations, while another might deliver technical accuracy but struggle with conversational fluency.
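To make the DPO stage concrete, the sketch below shows one plausible way preference pairs could be assembled from the four source models: each source generates a candidate response to a prompt, the candidates are ranked by an external scorer, and the highest- and lowest-ranked responses become the chosen/rejected pair. This is a minimal illustration, not the authors' released pipeline; the helpers `generate_response` and `score_response` are hypothetical placeholders for source-model inference and reward scoring.

```python
# Minimal sketch (not the FuseChat-3.0 release) of building DPO preference
# pairs from responses produced by multiple heterogeneous source LLMs.
from dataclasses import dataclass

SOURCE_MODELS = [
    "Gemma-2-27B-it",
    "Mistral-Large-Instruct-2407",
    "Qwen-2.5-72B-Instruct",
    "Llama-3.1-70B-Instruct",
]

@dataclass
class PreferencePair:
    prompt: str
    chosen: str    # highest-scoring source response (also usable as an SFT target)
    rejected: str  # lowest-scoring source response

def build_preference_pair(prompt: str, generate_response, score_response) -> PreferencePair:
    """Sample one response per source model, then rank them with an external scorer.

    `generate_response(model_name, prompt)` and `score_response(prompt, response)`
    are hypothetical callables standing in for source-model inference and a
    reward model or rubric-based judge, respectively.
    """
    candidates = [(name, generate_response(name, prompt)) for name in SOURCE_MODELS]
    ranked = sorted(candidates, key=lambda nc: score_response(prompt, nc[1]), reverse=True)
    return PreferencePair(prompt=prompt, chosen=ranked[0][1], rejected=ranked[-1][1])
```

In such a setup, the chosen responses could also serve as SFT targets for aligning the target model with the source distributions before the DPO stage, matching the two-stage pipeline described above.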