task-specific knowledge
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > Middle East > Jordan (0.04)
- Asia > Singapore (0.04)
- (4 more...)
- Research Report > Experimental Study (0.93)
- Research Report > New Finding (0.92)
- Banking & Finance (0.67)
- Law (0.46)
- Education (0.45)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
- Information Technology > Artificial Intelligence > Vision (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
- Europe > Switzerland > Zürich > Zürich (0.14)
- Asia > Singapore (0.04)
- Asia > Middle East > Israel (0.04)
- (2 more...)
Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging
In the era of large language models, model merging is a promising way to combine multiple task-specific models into a single multitask model without extra training. However, two challenges remain: (a) interference between different models and (b) heterogeneous data during testing. Traditional model merging methods often show significant performance gaps compared to fine-tuned models due to these issues. Additionally, a one-size-fits-all model lacks flexibility for diverse test data, leading to performance degradation. We show that both shared and exclusive task-specific knowledge are crucial for merging performance, but directly merging exclusive knowledge hinders overall performance. In view of this, we propose Twin-Merging, a method that encompasses two principal stages: (1) modularizing knowledge into shared and exclusive components, with compression to reduce redundancy and enhance efficiency; (2) dynamically merging shared and task-specific knowledge based on the input. This approach narrows the performance gap between merged and fine-tuned models and improves adaptability to heterogeneous data. Extensive experiments on 20 datasets for both language and vision tasks demonstrate the effectiveness of our method, showing an average improvement of 28.34% in absolute normalized score for discriminative tasks and even surpassing the fine-tuned upper bound on the generative tasks.
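The two-stage recipe lends itself to a compact sketch. The PyTorch snippet below is illustrative only: the mean task vector stands in for the shared component, a truncated SVD stands in for compression of the exclusive residuals, and a softmax over assumed router logits stands in for the input-dependent merge; none of these choices are claimed to match the paper's actual implementation.

```python
import torch

def split_shared_exclusive(pretrained, experts, rank=4):
    """Split each expert's task vector into a shared part (here: the mean task
    vector) and a low-rank "exclusive" residual (here: a truncated SVD)."""
    task_vectors = [e - pretrained for e in experts]
    shared = torch.stack(task_vectors).mean(dim=0)
    exclusive = []
    for tv in task_vectors:
        U, S, Vh = torch.linalg.svd(tv - shared, full_matrices=False)
        exclusive.append((U[:, :rank] * S[:rank]) @ Vh[:rank])  # rank-r approximation
    return shared, exclusive

def dynamic_merge(pretrained, shared, exclusive, router_logits):
    """Compose an input-conditioned weight: base + shared + softmax-weighted
    exclusive residuals (router_logits would come from a small gate over the input)."""
    weights = torch.softmax(router_logits, dim=-1)
    merged = pretrained + shared
    for w, residual in zip(weights, exclusive):
        merged = merged + w * residual
    return merged

# Toy usage with random matrices standing in for one parameter block of a model.
torch.manual_seed(0)
pretrained = torch.randn(16, 16)
experts = [pretrained + 0.1 * torch.randn(16, 16) for _ in range(3)]
shared, exclusive = split_shared_exclusive(pretrained, experts)
merged = dynamic_merge(pretrained, shared, exclusive,
                       router_logits=torch.tensor([1.0, 0.2, -0.5]))
print(merged.shape)  # torch.Size([16, 16])
```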
Dual Prototype Evolving for Test-Time Generalization of Vision-Language Models
Test-time adaptation, which enables models to generalize to diverse data with unlabeled test samples, holds significant value in real-world scenarios. Recently, researchers have applied this setting to advanced pre-trained vision-language models (VLMs), developing approaches such as test-time prompt tuning to further extend their practical applicability. However, these methods typically focus solely on adapting VLMs from a single modality and fail to accumulate task-specific knowledge as more samples are processed. To address this, we introduce Dual Prototype Evolving (DPE), a novel test-time adaptation approach for VLMs that effectively accumulates task-specific knowledge from multi-modalities. Specifically, we create and evolve two sets of prototypes (textual and visual) to progressively capture more accurate multi-modal representations for target classes during test time. Moreover, to promote consistent multi-modal representations, we introduce and optimize learnable residuals for each test sample to align the prototypes from both modalities. Extensive experimental results on 15 benchmark datasets demonstrate that our proposed DPE consistently outperforms previous state-of-the-art methods while also exhibiting competitive computational efficiency.
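As a rough illustration of the dual-prototype idea, the sketch below keeps one textual and one visual prototype per class and absorbs confident test samples into the visual side with a momentum rule. The momentum update, confidence threshold, and fusion weight are assumptions for illustration, not the paper's evolving mechanism, and the learnable residual alignment is omitted.

```python
import torch
import torch.nn.functional as F

class DualPrototypes:
    """Keep one textual and one visual prototype per class; absorb confident
    unlabeled test samples into the visual prototypes as the stream arrives."""
    def __init__(self, text_features, momentum=0.9):
        # text_features: (num_classes, dim) class embeddings from the text encoder (assumed given)
        self.text_protos = F.normalize(text_features, dim=-1)
        self.visual_protos = self.text_protos.clone()   # initialise visual side from text side
        self.momentum = momentum

    def predict(self, image_feature, alpha=0.5):
        z = F.normalize(image_feature, dim=-1)
        return alpha * z @ self.text_protos.T + (1 - alpha) * z @ self.visual_protos.T

    def evolve(self, image_feature, conf_threshold=0.6):
        z = F.normalize(image_feature, dim=-1)
        probs = self.predict(z).softmax(dim=-1)
        conf, cls = probs.max(dim=-1)
        if conf > conf_threshold:                        # only absorb confident samples
            blended = self.momentum * self.visual_protos[cls] + (1 - self.momentum) * z
            self.visual_protos[cls] = F.normalize(blended, dim=-1)

# Toy usage with random vectors standing in for encoder features.
torch.manual_seed(0)
bank = DualPrototypes(text_features=torch.randn(5, 64))
for _ in range(10):
    bank.evolve(torch.randn(64))
print(bank.predict(torch.randn(64)).argmax().item())
```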
GraphAdapter: Tuning Vision-Language Models With Dual Knowledge Graph
Adapter-style efficient transfer learning (ETL) has shown excellent performance in the tuning of vision-language models (VLMs) under the low-data regime, where only a few additional parameters are introduced to excavate the task-specific knowledge based on the general and powerful representation of VLMs. However, most adapter-style works face two limitations: (i) modeling task-specific knowledge with a single modality only; and (ii) overlooking the exploitation of the inter-class relationships in downstream tasks, thereby leading to sub-optimal solutions. To mitigate that, we propose an effective adapter-style tuning strategy, dubbed GraphAdapter, which performs the textual adapter by explicitly modeling the dual-modality structure knowledge (i.e., the correlation of different semantics/classes in textual and visual modalities) with a dual knowledge graph. In particular, the dual knowledge graph is established with two sub-graphs, i.e., a textual knowledge sub-graph, and a visual knowledge sub-graph, where the nodes and edges represent the semantics/classes and their correlations in two modalities, respectively. This enables the textual feature of each prompt to leverage the task-specific structure knowledge from both textual and visual modalities, yielding a more effective classifier for downstream tasks. Extensive experimental results on 11 benchmark datasets reveal that our GraphAdapter significantly outperforms the previous adapter-based methods.
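A minimal sketch of the dual-knowledge-graph idea follows: each sub-graph is approximated by a cosine-similarity adjacency over class node features, and a textual prompt feature is refined by one hop of propagation over both graphs. The adjacency construction, single-hop propagation, and fusion weight are illustrative assumptions, not the paper's adapter design.

```python
import torch
import torch.nn.functional as F

def build_graph(node_features):
    """One sub-graph: row-normalised cosine-similarity adjacency over class nodes."""
    z = F.normalize(node_features, dim=-1)
    adj = (z @ z.T).clamp(min=0)
    return adj / adj.sum(dim=-1, keepdim=True)

def graph_adapt(prompt_feature, text_nodes, visual_nodes, beta=0.7):
    """Refine a textual prompt feature with one hop of propagation over the
    textual and visual sub-graphs, then fuse the two aggregated messages."""
    adj_t, adj_v = build_graph(text_nodes), build_graph(visual_nodes)
    sim = F.normalize(prompt_feature, dim=-1) @ F.normalize(text_nodes, dim=-1).T  # affinity to class nodes
    agg_t = sim @ adj_t @ F.normalize(text_nodes, dim=-1)     # message from the textual graph
    agg_v = sim @ adj_v @ F.normalize(visual_nodes, dim=-1)   # message from the visual graph
    fused = beta * agg_t + (1 - beta) * agg_v
    return F.normalize(prompt_feature + fused, dim=-1)

# Toy usage: 10 classes, 32-dim features standing in for CLIP embeddings.
torch.manual_seed(0)
text_nodes, visual_nodes = torch.randn(10, 32), torch.randn(10, 32)
adapted = graph_adapt(torch.randn(32), text_nodes, visual_nodes)
print(adapted.shape)  # torch.Size([32])
```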
Bisecle: Binding and Separation in Continual Learning for Video Language Understanding
Yue Tan, Xiaoqian Hu, Hao Xue, Celso De Melo, Flora D. Salim
Frontier vision-language models (VLMs) have made remarkable improvements in video understanding tasks. However, real-world videos typically exist as continuously evolving data streams (e.g., dynamic scenes captured by wearable glasses), necessitating models to continually adapt to shifting data distributions and novel scenarios. Considering the prohibitive computational costs of fine-tuning models on new tasks, usually, a small subset of parameters is updated while the bulk of the model remains frozen. This poses new challenges to existing continual learning frameworks in the context of large multimodal foundation models, i.e., catastrophic forgetting and update conflict. While the foundation models struggle with parameter-efficient continual learning, the hippocampus in the human brain has evolved highly efficient mechanisms for memory formation and consolidation. Inspired by the rapid Binding and pattern separation mechanisms in the hippocampus, in this work, we propose Bisecle for video-language continual learning, where a multi-directional supervision module is used to capture more cross-modal relationships and a contrastive prompt learning scheme is designed to isolate task-specific knowledge to facilitate efficient memory storage. Binding and separation processes further strengthen the ability of VLMs to retain complex experiences, enabling robust and efficient continual learning in video understanding tasks. We perform a thorough evaluation of the proposed Bisecle, demonstrating its ability to mitigate forgetting and enhance cross-task generalization on several VideoQA benchmarks.
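To make the separation idea concrete, here is a small, hypothetical contrastive objective over task prompts: same-task prompt pairs are pulled together (binding) while cross-task pairs are pushed apart (separation). The hinge-style formulation and margin are assumptions for illustration; the paper's multi-directional supervision module is not sketched here.

```python
import torch
import torch.nn.functional as F

def prompt_separation_loss(task_prompts, task_ids, margin=0.5):
    """Pull same-task prompt pairs together (binding) and push cross-task pairs
    apart (separation) so task-specific knowledge stays isolated."""
    z = F.normalize(task_prompts, dim=-1)                    # (N, dim)
    sim = z @ z.T                                            # pairwise cosine similarities
    same_task = task_ids.unsqueeze(0) == task_ids.unsqueeze(1)
    off_diag = ~torch.eye(len(z), dtype=torch.bool)
    pos = sim[same_task & off_diag]                          # same-task pairs
    neg = sim[~same_task]                                    # cross-task pairs
    # hinge-style objective: positives above the margin, negatives below zero
    return F.relu(margin - pos).mean() + F.relu(neg).mean()

# Toy usage: 8 learnable prompts, two per task, across 4 sequential tasks.
torch.manual_seed(0)
prompts = torch.randn(8, 16, requires_grad=True)
loss = prompt_separation_loss(prompts, task_ids=torch.tensor([0, 0, 1, 1, 2, 2, 3, 3]))
loss.backward()
print(round(loss.item(), 4))
```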
- Oceania > Australia > New South Wales > Sydney (0.04)
- North America > United States (0.04)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
Information-Theoretic Complementary Prompts for Improved Continual Text Classification
Duzhen Zhang, Yong Ren, Chenxing Li, Dong Yu, Tielin Zhang
Continual Text Classification (CTC) aims to continuously classify new text data over time while minimizing catastrophic forgetting of previously acquired knowledge. However, existing methods often focus on task-specific knowledge, overlooking the importance of shared, task-agnostic knowledge. Inspired by the complementary learning systems theory, which posits that humans learn continually through the interaction of two systems -- the hippocampus, responsible for forming distinct representations of specific experiences, and the neocortex, which extracts more general and transferable representations from past experiences -- we introduce Information-Theoretic Complementary Prompts (InfoComp), a novel approach for CTC. InfoComp explicitly learns two distinct prompt spaces: P(rivate)-Prompt and S(hared)-Prompt. These respectively encode task-specific and task-invariant knowledge, enabling models to sequentially learn classification tasks without relying on data replay. To promote more informative prompt learning, InfoComp uses an information-theoretic framework that maximizes mutual information between different parameters (or encoded representations). Within this framework, we design two novel loss functions: (1) to strengthen the accumulation of task-specific knowledge in P-Prompt, effectively mitigating catastrophic forgetting, and (2) to enhance the retention of task-invariant knowledge in S-Prompt, improving forward knowledge transfer. Extensive experiments on diverse CTC benchmarks show that our approach outperforms previous state-of-the-art methods.
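The private/shared prompt split can be sketched as follows: a single task-invariant S-Prompt plus one P-Prompt per task, prepended to frozen-encoder token embeddings. Prompt length, dimensionality, and the class and parameter names are illustrative assumptions; the information-theoretic losses themselves are not reproduced here.

```python
import torch
import torch.nn as nn

class ComplementaryPrompts(nn.Module):
    """One task-invariant S-Prompt shared across all tasks plus one task-specific
    P-Prompt per task; both are prepended to the (frozen) encoder's token embeddings."""
    def __init__(self, num_tasks, prompt_len=8, dim=768):
        super().__init__()
        self.shared = nn.Parameter(torch.randn(prompt_len, dim) * 0.02)        # S-Prompt
        self.private = nn.ParameterList(
            [nn.Parameter(torch.randn(prompt_len, dim) * 0.02) for _ in range(num_tasks)]  # P-Prompts
        )

    def forward(self, token_embeddings, task_id):
        # token_embeddings: (batch, seq_len, dim)
        batch = token_embeddings.size(0)
        prompts = torch.cat([self.shared, self.private[task_id]], dim=0)
        prompts = prompts.unsqueeze(0).expand(batch, -1, -1)
        return torch.cat([prompts, token_embeddings], dim=1)

# Toy usage: 4 sequential classification tasks on 768-dim token embeddings.
torch.manual_seed(0)
prompter = ComplementaryPrompts(num_tasks=4)
x = torch.randn(2, 32, 768)                  # stand-in for frozen-encoder outputs
print(prompter(x, task_id=1).shape)          # torch.Size([2, 48, 768])
```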
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- Asia > China > Beijing > Beijing (0.04)
- (6 more...)
- Overview (1.00)
- Research Report > Experimental Study (0.93)
- Research Report > Promising Solution (0.68)
- Information Technology > Security & Privacy (0.46)
- Health & Medicine > Therapeutic Area > Neurology (0.34)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
- Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.71)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)