SE-Merging: A Self-Enhanced Approach for Dynamic Model Merging
Zijun Chen, Zhanpeng Zhou, Bo Zhang, Weinan Zhang, Xi Sun, Junchi Yan
arXiv.org Artificial Intelligence
Model merging has gained increasing attention due to its intriguing property: interpolating the parameters of different task-specific fine-tuned models leads to multi-task abilities. However, despite its empirical success, the underlying mechanisms of model merging remain poorly understood. In this work, we delve into the mechanism behind model merging from a representation perspective. Our analysis reveals that model merging achieves multi-task abilities through two key capabilities: i) distinguishing samples from different tasks, and ii) adapting to the corresponding expert model for each sample. These two capabilities allow the merged model to retain task-specific expertise, enabling efficient multi-task adaptation. Building on these insights, we propose \texttt{SE-Merging}, a self-enhanced model merging framework that leverages these two characteristics to dynamically identify the corresponding task for each sample and then adaptively rescale the merging coefficients, further enhancing task-specific expertise in the merged model. Notably, \texttt{SE-Merging} achieves dynamic model merging without additional training. Extensive experiments demonstrate that \texttt{SE-Merging} achieves significant performance improvements while remaining compatible with existing model merging techniques.
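The abstract describes a two-step mechanism: identify which task a sample belongs to, then rescale the merging coefficients toward that task's expert. The sketch below illustrates this idea in the common task-vector setting (merged = pretrained + sum of weighted task vectors). It is a minimal illustration, not the paper's actual algorithm: the prototype-based task identification via cosine similarity, the softmax reweighting, and all parameter names (`base_coeff`, `boost`) are assumptions made for exposition.

```python
import numpy as np

def dynamic_merge_coefficients(sample_feat, expert_feats, base_coeff=0.3, boost=1.0):
    """Illustrative per-sample coefficient rescaling.

    sample_feat: representation of the current input sample.
    expert_feats: one prototype representation per expert/task
        (a hypothetical stand-in for the task-identification signal).
    """
    # Cosine similarity between the sample and each expert prototype.
    sims = np.array([
        np.dot(sample_feat, e) / (np.linalg.norm(sample_feat) * np.linalg.norm(e))
        for e in expert_feats
    ])
    # Softmax over similarities gives per-task weights in [0, 1].
    w = np.exp(sims - sims.max())
    w = w / w.sum()
    # Keep a uniform base coefficient and boost the identified task's expert.
    return base_coeff + boost * w

def merge_parameters(pretrained, task_vectors, coeffs):
    """Standard task-vector merge: pretrained + sum_i coeff_i * (ft_i - pretrained)."""
    merged = pretrained.copy()
    for c, tv in zip(coeffs, task_vectors):
        merged += c * tv
    return merged
```

Because the coefficients depend on each sample's representation, the merge is dynamic: samples resembling expert A's inputs pull the merged parameters toward expert A, which matches the paper's stated goal of adapting to the corresponding expert per sample, and no extra training is needed.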
Jun-24-2025