Optimal Brain Iterative Merging: Mitigating Interference in LLM Merging

Zhixiang Wang, Zhenyu Mao, Yixuan Qiao, Yunfang Wu, Biye Li

arXiv.org Artificial Intelligence 

Large Language Models (LLMs) have demonstrated impressive capabilities, but their high computational costs pose challenges for customization. Model merging offers a cost-effective alternative, yet existing methods suffer from interference among parameters, leading to performance degradation. In this work, we propose Optimal Brain Iterative Merging (OBIM), a novel method designed to mitigate both intra-model and inter-model interference. OBIM consists of two key components: (1) A saliency measurement mechanism that evaluates parameter importance based on loss changes induced by individual weight alterations, reducing intra-model interference by […]

Figure 1: Illustration of inter-model interference. The dotted box highlights cases where TIES fails to resolve interference. Approximately 46% of parameters deviate from the original models due to task vector averaging in the absence of sign conflicts.
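The failure mode the caption describes is easy to reproduce: a TIES-style merge only drops entries whose task vectors disagree in sign, so entries that agree are still averaged, and the merged value can match neither source model. Below is a minimal sketch of that behavior, assuming a simplified elect-sign-then-mean rule (the real TIES algorithm also trims by magnitude; `ties_like_merge` is a hypothetical name):

```python
import torch

def ties_like_merge(deltas: torch.Tensor) -> torch.Tensor:
    """Toy sign-election + mean over task vectors stacked on dim 0."""
    elected = torch.sign(deltas.sum(dim=0))    # elected sign per entry
    agree = torch.sign(deltas) == elected      # entries matching the elected sign
    kept = torch.where(agree, deltas, torch.zeros_like(deltas))
    return kept.sum(dim=0) / agree.sum(dim=0).clamp(min=1)

# Two task vectors that AGREE in sign at an entry: no sign conflict,
# yet averaging 0.5 and 0.1 yields 0.3, matching neither original model.
deltas = torch.tensor([[0.5], [0.1]])
print(ties_like_merge(deltas))  # tensor([0.3000])
```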
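Component (1) echoes the Optimal Brain Damage/Surgeon line of work, where a weight's saliency is the estimated loss increase caused by removing it; under a diagonal-Hessian approximation this is roughly 0.5 * H_ii * delta_i^2 for zeroing task-vector entry delta_i. The sketch below shows one way such a score could gate a merge; it is an illustration, not the paper's implementation, and the diagonal Fisher stand-in and all helper names are assumptions:

```python
import torch

def saliency(delta: torch.Tensor, hessian_diag: torch.Tensor) -> torch.Tensor:
    """Second-order Taylor estimate of the loss change from zeroing each
    task-vector entry: roughly 0.5 * H_ii * delta_i**2."""
    return 0.5 * hessian_diag * delta.pow(2)

def top_k_mask(scores: torch.Tensor, keep_ratio: float) -> torch.Tensor:
    """Binary mask keeping the `keep_ratio` most salient entries."""
    k = max(1, int(keep_ratio * scores.numel()))
    idx = torch.topk(scores.flatten(), k).indices
    mask = torch.zeros(scores.numel())
    mask[idx] = 1.0
    return mask.view_as(scores)

# Keep only the top 20% most salient task-vector entries before merging,
# discarding low-saliency alterations that mostly add intra-model noise.
base = torch.randn(4096)
finetuned = base + 0.01 * torch.randn(4096)
delta = finetuned - base
fisher_diag = torch.rand(4096)   # stand-in for a diagonal Fisher estimate
mask = top_k_mask(saliency(delta, fisher_diag), keep_ratio=0.2)
merged = base + mask * delta
```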