Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging

May-27-2025, 08:46:03 GMT–Neural Information Processing Systems

In the era of large language models, model merging is a promising way to combine multiple task-specific models into a single multitask model without extra training. However, two challenges remain: (a) interference between different models and (b) heterogeneous data during testing. Traditional model merging methods often show significant performance gaps compared to fine-tuned models due to these issues. Additionally, a one-size-fits-all model lacks flexibility for diverse test data, leading to performance degradation. We show that both shared and exclusive task-specific knowledge are crucial for merging performance, but directly merging exclusive knowledge hinders overall performance.
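As a rough illustration of what merging "without extra training" can mean, the sketch below applies task arithmetic, a generic merging baseline and not the Twin-Merging method itself. Each fine-tuned model's offset from a shared pretrained checkpoint is summed, scaled, and added back to the pretrained weights. The function name `task_arithmetic_merge`, the `scale` parameter, and the toy weight dictionaries are all assumptions made for this example.

```python
def task_arithmetic_merge(pretrained, finetuned_models, scale=0.5):
    """Merge fine-tuned models into one set of weights, with no training.

    Hypothetical helper for illustration only: a generic task-arithmetic
    baseline, not the method proposed in the paper.
    Weights are dicts mapping a parameter name to a list of floats.
    """
    merged = {}
    for name, base in pretrained.items():
        merged[name] = [
            # Sum each model's offset from the pretrained value (its
            # "task vector"), scale the sum, and add it back to the base.
            b + scale * sum(model[name][i] - b for model in finetuned_models)
            for i, b in enumerate(base)
        ]
    return merged


# Toy weights (assumed values): each "model" changed a different coordinate.
pretrained = {"w": [0.0, 0.0, 0.0]}
model_a = {"w": [1.0, 0.0, 0.0]}
model_b = {"w": [0.0, 2.0, 0.0]}

merged = task_arithmetic_merge(pretrained, [model_a, model_b], scale=0.5)
print(merged["w"])
```

In this toy case the two models modify disjoint coordinates, so their updates do not collide. When several models push the same parameter in different directions, the summed update blends them, which is one simple way to picture the interference the abstract refers to.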

dynamic integration, modular expertise, twin-merging

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language (0.62)
  - Representation & Reasoning (0.42)