SE-Merging: A Self-Enhanced Approach for Dynamic Model Merging
Zijun Chen, Zhanpeng Zhou, Bo Zhang, Weinan Zhang, Xi Sun, Junchi Yan
arXiv.org Artificial Intelligence
Model merging has gained increasing attention due to its intriguing property: interpolating the parameters of different task-specific fine-tuned models leads to multi-task abilities. However, despite its empirical success, the underlying mechanisms of model merging remain poorly understood. In this work, we delve into the mechanism behind model merging from a representation perspective. Our analysis reveals that model merging achieves multi-task abilities through two key capabilities: i) distinguishing samples from different tasks, and ii) adapting to the corresponding expert model for each sample. These two capabilities allow the merged model to retain task-specific expertise, enabling efficient multi-task adaptation. Building on these insights, we propose \texttt{SE-Merging}, a self-enhanced model merging framework that leverages these two characteristics to dynamically identify the corresponding task for each sample and then adaptively rescale the merging coefficients, further enhancing task-specific expertise in the merged model. Notably, \texttt{SE-Merging} achieves dynamic model merging without additional training. Extensive experiments demonstrate that \texttt{SE-Merging} achieves significant performance improvements while remaining compatible with existing model merging techniques.
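The abstract describes a two-step mechanism: identify which task a sample belongs to, then rescale the merging coefficients toward that task's expert. The sketch below illustrates this idea in the common task-vector setting (merged = pretrained + sum of weighted task vectors). It is a minimal illustration, not the paper's actual algorithm: the prototype-based task identification via cosine similarity, the softmax reweighting, and all parameter names (`base_coeff`, `boost`) are assumptions made for exposition.

```python
import numpy as np

def dynamic_merge_coefficients(sample_feat, expert_feats, base_coeff=0.3, boost=1.0):
    """Illustrative per-sample coefficient rescaling.

    sample_feat: representation of the current input sample.
    expert_feats: one prototype representation per expert/task
        (a hypothetical stand-in for the task-identification signal).
    """
    # Cosine similarity between the sample and each expert prototype.
    sims = np.array([
        np.dot(sample_feat, e) / (np.linalg.norm(sample_feat) * np.linalg.norm(e))
        for e in expert_feats
    ])
    # Softmax over similarities gives per-task weights in [0, 1].
    w = np.exp(sims - sims.max())
    w = w / w.sum()
    # Keep a uniform base coefficient and boost the identified task's expert.
    return base_coeff + boost * w

def merge_parameters(pretrained, task_vectors, coeffs):
    """Standard task-vector merge: pretrained + sum_i coeff_i * (ft_i - pretrained)."""
    merged = pretrained.copy()
    for c, tv in zip(coeffs, task_vectors):
        merged += c * tv
    return merged
```

Because the coefficients depend on each sample's representation, the merge is dynamic: samples resembling expert A's inputs pull the merged parameters toward expert A, which matches the paper's stated goal of adapting to the corresponding expert per sample, and no extra training is needed.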
Jun-24-2025