AITopics

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.42)

Neural Information Processing SystemsFeb-16-2026, 19:09:11 GMT

Reawakening knowledge: Anticipatory recovery from catastrophic interference via structured training Y anlai Y ang 1, Matt Jones

The behavior emerges and becomes more robust as the architecture scales up its number of parameters.

large language model, machine learning, natural language, (18 more...)

Country:

North America > United States > Colorado > Boulder County > Boulder (0.04)
North America > Dominican Republic (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Education (1.00)
Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Neural Information Processing SystemsOct-10-2025, 10:21:46 GMT

Reawakening knowledge: Anticipatory recovery from catastrophic interference via structured training Y anlai Y ang 1, Matt Jones

The behavior emerges and becomes more robust as the architecture scales up its number of parameters.

anticipatory recovery, experiment, recovery, (14 more...)

Country:

North America > United States > Colorado > Boulder County > Boulder (0.04)
North America > Dominican Republic (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Education (1.00)
Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

arXiv.org Artificial IntelligenceOct-8-2025

Understanding Catastrophic Interference: On the Identifibility of Latent Representations

Li, Yuke, Zheng, Yujia, Xiong, Tianyi, Wang, Zhenyi, Huang, Heng

Catastrophic interference, also known as catastrophic forgetting, is a fundamental challenge in machine learning, where a trained learning model progressively loses performance on previously learned tasks when adapting to new ones. In this paper, we aim to better understand and model the catastrophic interference problem from a latent representation learning point of view, and propose a novel theoretical framework that formulates catastrophic interference as an identification problem. Our analysis demonstrates that the forgetting phenomenon can be quantified by the distance between partial-task aware (PTA) and all-task aware (ATA) setups. Building upon recent advances in identifiability theory, we prove that this distance can be minimized through identification of shared latent variables between these setups. When learning, we propose our method \ourmeos with two-stage training strategy: First, we employ maximum likelihood estimation to learn the latent representations from both PTA and ATA configurations. Subsequently, we optimize the KL divergence to identify and learn the shared latent variables. Through theoretical guarantee and empirical validations, we establish that identifying and learning these shared representations can effectively mitigate catastrophic interference in machine learning systems. Our approach provides both theoretical guarantees and practical performance improvements across both synthetic and benchmark datasets.

artificial intelligence, international conference, machine learning, (14 more...)

2509.23027

Country: North America > United States > Maryland (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Neural Information Processing SystemsMay-27-2025, 09:37:40 GMT

Reawakening knowledge: Anticipatory recovery from catastrophic interference via structured training

We explore the training dynamics of neural networks in a structured non-IID setting where documents are presented cyclically in a fixed, repeated sequence. Typically, networks suffer from catastrophic interference when training on a sequence of documents; however, we discover a curious and remarkable property of LLMs finetuned sequentially in this setting: they exhibit anticipatory behavior, recovering from the forgetting on documents before seeing them again. The behavior emerges and becomes more robust as the architecture scales up its number of parameters. Through comprehensive experiments and visualizations, we uncover new insights into training over-parameterized networks in structured environments.

anticipatory recovery, catastrophic interference, reawakening knowledge, (1 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.32)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.32)

Soures, Nicholas, Helfer, Peter, Daram, Anurag, Pandit, Tej, Kudithipudi, Dhireesha

TACOS: Task Agnostic Continual Learning in Spiking Neural Networks

arXiv.org Artificial IntelligenceAug-16-2024

Catastrophic interference, the loss of previously learned information when learning new information, remains a major challenge in machine learning. Since living organisms do not seem to suffer from this problem, researchers have taken inspiration from biology to improve memory retention in artificial intelligence systems. However, previous attempts to use bio-inspired mechanisms have typically resulted in systems that rely on task boundary information during training and/or explicit task identification during inference, information that is not available in real-world scenarios. Here, we show that neuro-inspired mechanisms such as synaptic consolidation and metaplasticity can mitigate catastrophic interference in a spiking neural network, using only synapse-local information, with no need for task awareness, and with a fixed memory size that does not need to be increased when training on new tasks. Our model, TACOS, combines neuromodulation with complex synaptic dynamics to enable new learning while protecting previous information. We evaluate TACOS on sequential image recognition tasks and demonstrate its effectiveness in reducing catastrophic interference. Our results show that TACOS outperforms existing regularization techniques in domain-incremental learning scenarios. We also report the results of an ablation study to elucidate the contribution of each neuro-inspired mechanism separately.

metaplasticity, taco, task agnostic continual learning, (11 more...)

2409.00021

Country:

North America > United States > Texas > Bexar County > San Antonio (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Hewson, Joshua T. S., Sloman, Sabina J., Dubova, Marina

One system for learning and remembering episodes and rules

arXiv.org Artificial IntelligenceJul-8-2024

Humans can learn individual episodes and generalizable rules and also successfully retain both kinds of acquired knowledge over time. In the cognitive science literature, (1) learning individual episodes and rules and (2) learning and remembering are often both conceptualized as competing processes that necessitate separate, complementary learning systems. Inspired by recent research in statistical learning, we challenge these trade-offs, hypothesizing that they arise from capacity limitations rather than from the inherent incompatibility of the underlying cognitive processes. Using an associative learning task, we show that one system with excess representational capacity can learn and remember both episodes and rules.

artificial intelligence, episode and rule, machine learning, (16 more...)

2407.05884

Country:

North America > United States > Rhode Island > Providence County > Providence (0.05)
North America > United States > Indiana > Monroe County > Bloomington (0.05)
Europe > United Kingdom > England > Greater Manchester > Manchester (0.05)

Genre: Research Report > New Finding (0.69)

Industry: Health & Medicine (0.30)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

arXiv.org Artificial IntelligenceMay-5-2024

Exploring a Cognitive Architecture for Learning Arithmetic Equations

Gawin, Cole

The acquisition and performance of arithmetic skills and basic operations such as addition, subtraction, multiplication, and division are essential for daily functioning, and reflect complex cognitive processes. This paper explores the cognitive mechanisms powering arithmetic learning, presenting a neurobiologically plausible cognitive architecture that simulates the acquisition of these skills. I implement a number vectorization embedding network and an associative memory model to investigate how an intelligent system can learn and recall arithmetic equations in a manner analogous to the human brain. I perform experiments that provide insights into the generalization capabilities of connectionist models, neurological causes of dyscalculia, and the influence of network architecture on cognitive performance. Through this interdisciplinary investigation, I aim to contribute to ongoing research into the neural correlates of mathematical cognition in intelligent systems.

architecture, associative memory model, cognitive architecture, (15 more...)

2405.0455

Country:

North America > United States > California (0.14)
North America > Canada > Ontario > Toronto (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report > New Finding (0.68)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Education > Curriculum > Subject-Specific Education (0.72)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

van Deventer, Heinrich, Bosman, Anna Sergeevna

Distal Interference: Exploring the Limits of Model-Based Continual Learning

arXiv.org Artificial IntelligenceFeb-13-2024

Continual learning is the sequential learning of different tasks by a machine learning model. Continual learning is known to be hindered by catastrophic interference or forgetting, i.e. rapid unlearning of earlier learned tasks when new tasks are learned. Despite their practical success, artificial neural networks (ANNs) are prone to catastrophic interference. This study analyses how gradient descent and overlapping representations between distant input points lead to distal interference and catastrophic interference. Distal interference refers to the phenomenon where training a model on a subset of the domain leads to non-local changes on other subsets of the domain. This study shows that uniformly trainable models without distal interference must be exponentially large. A novel antisymmetric bounded exponential layer B-spline ANN architecture named ABEL-Spline is proposed that can approximate any continuous function, is uniformly trainable, has polynomial computational complexity, and provides some guarantees for distal interference. Experiments are presented to demonstrate the theoretical properties of ABEL-Splines. ABEL-Splines are also evaluated on benchmark regression problems. It is concluded that the weaker distal interference guarantees in ABEL-Splines are insufficient for model-only continual learning. It is conjectured that continual learning with polynomial complexity models requires augmentation of the training data or algorithm.

distal interference, interference, z-spline ann, (15 more...)

2402.08255

Country:

Africa > South Africa > Gauteng > Pretoria (0.04)
North America > United States > New York (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
(2 more...)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)

arXiv.org Artificial IntelligenceDec-13-2023

Toward a More Biologically Plausible Neural Network Model of Latent Cause Inference

Lu, Qihong, Nguyen, Tan T., Zhang, Qiong, Hasson, Uri, Griffiths, Thomas L., Zacks, Jeffrey M., Gershman, Samuel J., Norman, Kenneth A.

Humans spontaneously perceive a continuous stream of experience as discrete events. It has been hypothesized that this ability is supported by latent cause inference (LCI). We implemented this hypothesis using Latent Cause Network (LCNet), a neural network model of LCI. LCNet interacts with a Bayesian LCI mechanism that activates a unique context vector for each inferred latent cause. This architecture makes LCNet more biologically plausible than existing models of LCI and supports extraction of shared structure across latent causes. Across three simulations, we found that LCNet could 1) extract shared structure across latent causes in a function-learning task while avoiding catastrophic interference, 2) capture human data on curriculum effects in schema learning, and 3) infer the underlying event structure when processing naturalistic videos of daily activities. Our work provides a biologically plausible computational model that can operate in both laboratory experiment settings and naturalistic settings, opening up the possibility of providing a unified model of event cognition.

lcnet, representation, vector, (16 more...)

2312.08519

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Wisconsin (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.93)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)