Model Merging with Functional Dual Anchors

Kexuan Shi, Yandong Wen, Weiyang Liu

arXiv.org Artificial Intelligence 

Model merging is an efficient post-training strategy for integrating knowledge from multiple finetuned checkpoints of a shared foundation model. Existing methods operate in the parameter space, combining task vectors to mitigate conflicts, but they remain constrained by parameter inconsistencies across checkpoints. We propose Functional Dual Anchors (FDAs), a framework that instead models the input-representation space. FDAs are synthetic inputs whose induced gradients align with task vectors, capturing task-specific functional shifts relative to the pretrained model. We further introduce a principled initialization scheme and show that FDAs are complementary to parameter-space model merging. Comprehensive experiments demonstrate the effectiveness of FDAs in model merging.

Model merging has emerged as a promising post-training strategy for integrating knowledge from multiple finetuned checkpoints of foundation models. The core idea is to combine diverse domain knowledge from multiple homologous downstream models into a single unified model (Matena & Raffel, 2022; Jin et al., 2022). Compared to multi-task learning (Ruder, 2017) and continual learning (Wang et al., 2024), model merging is appealing because it consolidates knowledge directly through the parameters of downstream models finetuned from the same pretrained backbone.

[Figure: comparison of multi-task joint training, task arithmetic, and FDA. Inspired by joint training, FDA models the knowledge in the input space.]
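To make the core idea concrete, here is a hedged toy sketch (our illustration, not the paper's implementation): for a linear model, a synthetic input is optimized so that the gradient it induces at the pretrained weights points along the task vector. The model, squared loss, fixed label `y`, and finite-difference optimizer are all assumptions chosen for simplicity.

```python
import numpy as np

# Toy sketch of the FDA idea (illustrative assumptions throughout):
# find a synthetic input x whose induced gradient at the pretrained
# weights w_pre aligns with the task vector tau = w_ft - w_pre.

rng = np.random.default_rng(0)
d = 8
w_pre = rng.normal(size=d)           # pretrained weights
w_ft = w_pre + rng.normal(size=d)    # finetuned weights
tau = w_ft - w_pre                   # task vector

y = 1.0                              # fixed synthetic label (an assumption)

def induced_grad(x):
    # dL/dw at w_pre for the squared loss L = (w_pre @ x - y)^2
    return 2.0 * (w_pre @ x - y) * x

def misalignment(x):
    # 1 - cosine similarity between the descent direction -grad and tau
    g = -induced_grad(x)
    return 1.0 - (g @ tau) / (np.linalg.norm(g) * np.linalg.norm(tau) + 1e-12)

# Optimize the synthetic input by finite-difference gradient descent.
x = rng.normal(size=d)
eps, lr = 1e-5, 0.3
eye = np.eye(d)
for _ in range(1000):
    grad_x = np.array([
        (misalignment(x + eps * eye[i]) - misalignment(x - eps * eye[i])) / (2 * eps)
        for i in range(d)
    ])
    x -= lr * grad_x

print(f"final misalignment: {misalignment(x):.4f}")
```

In the paper's setting the model is a full network, the task vector lives in its entire parameter space, and the alignment objective would be optimized with automatic differentiation rather than finite differences; the sketch only shows that a synthetic input can encode a parameter-space shift as a functional (gradient-based) one.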
