Optimal Brain Iterative Merging: Mitigating Interference in LLM Merging

Zhixiang Wang, Zhenyu Mao, Yixuan Qiao, Yunfang Wu, Biye Li

arXiv.org Artificial Intelligence 

Large Language Models (LLMs) have demonstrated impressive capabilities, but their high computational costs pose challenges for customization. Model merging offers a cost-effective alternative, yet existing methods suffer from interference among parameters, leading to performance degradation. In this work, we propose Optimal Brain Iterative Merging (OBIM), a novel method designed to mitigate both intra-model and inter-model interference. OBIM consists of two key components: (1) A saliency measurement mechanism that evaluates parameter importance based on loss changes induced by individual weight alterations, reducing intra-model interference by […]

Figure 1: Illustration of inter-model interference. The dotted box highlights cases where TIES fails to resolve interference. Approximately 46% of parameters deviate from the original models due to task vector averaging in the absence of sign conflicts.
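The failure mode the caption describes is easy to reproduce: a TIES-style merge only drops entries whose task vectors disagree in sign, so entries that agree are still averaged, and the merged value can match neither source model. Below is a minimal sketch of that behavior, assuming a simplified elect-sign-then-mean rule (the real TIES algorithm also trims by magnitude; `ties_like_merge` is a hypothetical name):

```python
import torch

def ties_like_merge(deltas: torch.Tensor) -> torch.Tensor:
    """Toy sign-election + mean over task vectors stacked on dim 0."""
    elected = torch.sign(deltas.sum(dim=0))    # elected sign per entry
    agree = torch.sign(deltas) == elected      # entries matching the elected sign
    kept = torch.where(agree, deltas, torch.zeros_like(deltas))
    return kept.sum(dim=0) / agree.sum(dim=0).clamp(min=1)

# Two task vectors that AGREE in sign at an entry: no sign conflict,
# yet averaging 0.5 and 0.1 yields 0.3, matching neither original model.
deltas = torch.tensor([[0.5], [0.1]])
print(ties_like_merge(deltas))  # tensor([0.3000])
```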
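Component (1) echoes the Optimal Brain Damage/Surgeon line of work, where a weight's saliency is the estimated loss increase caused by removing it; under a diagonal-Hessian approximation this is roughly 0.5 * H_ii * delta_i^2 for zeroing task-vector entry delta_i. The sketch below shows one way such a score could gate a merge; it is an illustration, not the paper's implementation, and the diagonal Fisher stand-in and all helper names are assumptions:

```python
import torch

def saliency(delta: torch.Tensor, hessian_diag: torch.Tensor) -> torch.Tensor:
    """Second-order Taylor estimate of the loss change from zeroing each
    task-vector entry: roughly 0.5 * H_ii * delta_i**2."""
    return 0.5 * hessian_diag * delta.pow(2)

def top_k_mask(scores: torch.Tensor, keep_ratio: float) -> torch.Tensor:
    """Binary mask keeping the `keep_ratio` most salient entries."""
    k = max(1, int(keep_ratio * scores.numel()))
    idx = torch.topk(scores.flatten(), k).indices
    mask = torch.zeros(scores.numel())
    mask[idx] = 1.0
    return mask.view_as(scores)

# Keep only the top 20% most salient task-vector entries before merging,
# discarding low-saliency alterations that mostly add intra-model noise.
base = torch.randn(4096)
finetuned = base + 0.01 * torch.randn(4096)
delta = finetuned - base
fisher_diag = torch.rand(4096)   # stand-in for a diagonal Fisher estimate
mask = top_k_mask(saliency(delta, fisher_diag), keep_ratio=0.2)
merged = base + mask * delta
```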