AITopics | retention performance

Collaborating Authors

retention performance

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

05cdc7feee41e3572a9a3f4acb773891-Paper-Conference.pdf

Neural Information Processing SystemsFeb-7-2026, 09:54:28 GMT

Conversely, the opposite scenario is relatively benign.

artificial intelligence, machine learning, similarity, (16 more...)

Neural Information Processing Systems

Country:

South America > Peru > Lima Department > Lima Province > Lima (0.04)
Asia > Japan > Honshū > Chūbu > Nagano Prefecture > Nagano (0.04)

Genre: Research Report (0.46)

Industry: Education (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

05cdc7feee41e3572a9a3f4acb773891-Paper-Conference.pdf

Neural Information Processing SystemsOct-9-2025, 17:36:28 GMT

regularization, retention performance, similarity, (14 more...)

Neural Information Processing Systems

Country:

South America > Peru > Lima Department > Lima Province > Lima (0.04)
North America > United States > Missouri > St. Louis County > St. Louis (0.04)
Asia > Japan > Honshū > Chūbu > Nagano Prefecture > Nagano (0.04)

Genre: Research Report > Experimental Study (0.46)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

In-Context Learning can Perform Continual Learning Like Humans

Kang, Liuwang, Wang, Fan, Liu, Shaoshan, Chou, Hung-Chyun, Lin, Chuan, Ding, Ning

arXiv.org Artificial IntelligenceSep-30-2025

Large language models (LLMs) can adapt to new tasks via in-context learning (ICL) without parameter updates, making them powerful learning engines for fast adaptation. While extensive research has examined ICL as a few-shot learner, whether it can achieve long-term retention and cross-task knowledge accumulation when multitasks arrive sequentially remains underexplored. Motivated by human memory studies, we investigate the retention characteristics of ICL in multitask settings and extend it to in-context continual learning (ICCL), where continual learning ability emerges through task scheduling and prompt rearrangement. Experiments on Markov-Chain benchmarks demonstrate that, for specific large-language models, ICCL benefits from distributed practice (DP) in a manner analogous to humans, consistently revealing a spacing "sweet spot" for retention. Beyond retention performance, we propose a human-retention similarity metric to quantify how closely a continual-learning (CL) method aligns with human retention dynamics. Using this metric, we show that linear-attention models such as MAMBA and RWKV exhibit particularly human-like retention patterns, despite their retention performance lagging behind that of Transformer-based LLMs. Overall, our results establish ICCL as both cognitively plausible and practically effective, providing an inference-only CL paradigm that mitigates catastrophic forgetting and addresses the stability-plasticity dilemma in conventional CL methods.

large language model, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2509.22764

Country: Asia > China (0.14)

Genre: Research Report > New Finding (0.88)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.36)

Add feedback

GRU: Mitigating the Trade-off between Unlearning and Retention for Large Language Models

Wang, Yue, Wang, Qizhou, Liu, Feng, Huang, Wei, Du, Yali, Du, Xiaojiang, Han, Bo

arXiv.org Artificial IntelligenceMar-12-2025

Large language model (LLM) unlearning has demonstrated its essential role in removing privacy and copyright-related responses, crucial for their legal and safe applications. However, the pursuit of complete unlearning often comes with substantial costs due to its compromises in their general functionality, leading to a notorious trade-off between unlearning and retention. In examining the update process for unlearning dynamically, we find gradients hold essential information for revealing this trade-off. In particular, we look at the varying relationship between retention performance and directional disparities between gradients during unlearning. It motivates the sculpting of an update mechanism derived from gradients from two sources, i.e., harmful for retention and useful for unlearning. Accordingly, we propose Gradient Rectified Unlearning (GRU), an enhanced unlearning framework controlling the updating gradients in a geometry-focused and optimization-driven manner such that their side impacts on other, unrelated responses can be minimized. Specifically, GRU derives a closed-form solution to project the unlearning gradient onto the orthogonal space of that gradient harmful for retention, ensuring minimal deviation from its original direction under the condition that overall performance is retained. Comprehensive experiments are conducted to demonstrate that GRU, as a general framework, is straightforward to implement and efficiently enhances a range of baseline methods through its adaptable and compatible characteristics. Additionally, experimental results show its broad effectiveness across a diverse set of benchmarks for LLM unlearning.

arxiv preprint arxiv, gru, retention, (13 more...)

arXiv.org Artificial Intelligence

2503.09117

Country: Asia > China > Hong Kong (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Law (0.88)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Disentangling and Mitigating the Impact of Task Similarity for Continual Learning

Hiratani, Naoki

arXiv.org Machine LearningMay-30-2024

Continual learning of partially similar tasks poses a challenge for artificial neural networks, as task similarity presents both an opportunity for knowledge transfer and a risk of interference and catastrophic forgetting. However, it remains unclear how task similarity in input features and readout patterns influences knowledge transfer and forgetting, as well as how they interact with common algorithms for continual learning. Here, we develop a linear teacher-student model with latent structure and show analytically that high input feature similarity coupled with low readout similarity is catastrophic for both knowledge transfer and retention. Conversely, the opposite scenario is relatively benign. Our analysis further reveals that task-dependent activity gating improves knowledge retention at the expense of transfer, while task-dependent plasticity gating does not affect either retention or transfer performance at the over-parameterized limit. In contrast, weight regularization based on the Fisher information metric significantly improves retention, regardless of task similarity, without compromising transfer performance. Nevertheless, its diagonal approximation and regularization in the Euclidean space are much less robust against task similarity. We demonstrate consistent results in a permuted MNIST task with latent variables. Overall, this work provides insights into when continual learning is difficult and how to mitigate it.

regularization, retention performance, similarity, (14 more...)

arXiv.org Machine Learning

2405.20236

Country:

South America > Peru > Lima Department > Lima Province > Lima (0.04)
North America > United States > Missouri > St. Louis County > St. Louis (0.04)
Asia > Japan > Honshū > Chūbu > Nagano Prefecture > Nagano (0.04)

Genre: Research Report (0.81)

Industry: Education > Educational Technology > Educational Software (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Knowledge Management > Knowledge Engineering (0.96)

Add feedback