Activation-Guided Consensus Merging for Large Language Models
–Neural Information Processing Systems
Recent research has increasingly focused on reconciling the reasoning capabilities of System 2 with the efficiency of System 1. While existing training-based and prompt-based approaches face significant challenges in terms of efficiency and stability, model merging emerges as a promising strategy to integrate the diverse capabilities of different Large Language Models (LLMs) into a unified model. However, conventional model merging methods often assume uniform importance across layers, overlooking the functional heterogeneity inherent in neural components. To address this limitation, we propose Activation-Guided Consensus Merging (ACM), a plug-and-play merging framework that determines layer-specific merging coefficients based on mutual information between activations of pre-trained and fine-tuned models. ACM effectively preserves task-specific capabilities without requiring gradient computations or additional training. Extensive experiments on Long-to-Short (L2S) and general merging tasks demonstrate that ACM consistently outperforms all baseline methods. For instance, in the case of Qwen-7B models, TIES-Merging equipped with ACM achieves a 55.3% reduction in response length while simultaneously improving reasoning accuracy by 1.3 points.
Neural Information Processing Systems
Jun-22-2026, 01:57:03 GMT
- Country:
- Europe (0.68)
- North America > United States (0.67)
- Asia > China (0.46)
- Genre:
- Overview (0.67)
- Research Report
- New Finding (1.00)
- Experimental Study (1.00)
- Industry:
- Information Technology > Security & Privacy (0.46)
- Technology: