 a-gem




0b5e29aa1acf8bdc5d8935d7036fa4f5-AuthorFeedback.pdf

Neural Information Processing Systems

On the task, all the methods share a similar noisy pattern. The results show the benefits of adjusting α1 during training. It is shown in [2] that A-GEM performs better than or comparably to GEM, so we focus on comparing with A-GEM.





Review for NeurIPS paper: Continual Deep Learning by Functional Regularisation of Memorable Past

Neural Information Processing Systems

What are the real contributions of the paper? The idea of regularizing the outputs (or functional-regularization) has already been explored, as already said in the paper. Combining the idea of regularizing the outputs with memory-based methods is also already explored. Please see GEM [1] and A-GEM [2]. What makes this approach better or important, e.g.


Characterizing Continual Learning Scenarios and Strategies for Audio Analysis

Bhatt, Ruchi, Kumari, Pratibha, Mahapatra, Dwarikanath, Saddik, Abdulmotaleb El, Saini, Mukesh

arXiv.org Artificial Intelligence

Audio analysis is useful in many application scenarios. The state-of-the-art audio analysis approaches assume that the data distribution at training and deployment time will be the same. However, due to various real-life environmental factors, the data may drift in its distribution or new classes may appear in the future. Thus, a one-time trained model might not perform adequately. In this paper, we characterize continual learning (CL) approaches in audio analysis, intended to tackle catastrophic forgetting arising due to drifts. As there is no CL dataset for audio analysis, we use DCASE 2020 to 2023 datasets to create various CL scenarios for audio-based monitoring tasks. We have investigated the following CL and non-CL approaches: EWC, LwF, SI, GEM, A-GEM, GDumb, Replay, Naive, cumulative, and joint training. The study is very beneficial for researchers and practitioners working in the area of audio analysis for developing adaptive models. We observed that Replay achieved better results than other methods on the DCASE challenge data. It achieved an accuracy of 70.12% for the domain incremental scenario and an accuracy of 96.98% for the class incremental scenario.


Gradient Episodic Memory with a Soft Constraint for Continual Learning

Hu, Guannan, Zhang, Wu, Ding, Hu, Zhu, Wenhao

arXiv.org Artificial Intelligence

Catastrophic forgetting in continual learning is a common destructive phenomenon in gradient-based neural networks that learn sequential tasks, and it is quite different from forgetting in humans, who can learn and accumulate knowledge throughout their whole lives. Catastrophic forgetting is a large decrease in performance on previous tasks while the model is learning a novel task. To alleviate this problem, the model should have the capacity to learn new knowledge and preserve learned knowledge. We propose an average gradient episodic memory (A-GEM) with a soft constraint $\epsilon \in [0, 1]$, which is a balance factor between learning new knowledge and preserving learned knowledge; our method is called gradient episodic memory with a soft constraint $\epsilon$ ($\epsilon$-SOFT-GEM). $\epsilon$-SOFT-GEM outperforms A-GEM and several continual learning benchmarks in a single training epoch; additionally, it has state-of-the-art average accuracy and efficiency for computation and memory, like A-GEM, and provides a better trade-off between the stability of preserving learned knowledge and the plasticity of learning new knowledge.
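
The abstract above does not spell out how $\epsilon$ enters the update, so the sketch below illustrates only the standard A-GEM projection that the method builds on, not $\epsilon$-SOFT-GEM itself. In A-GEM, the gradient of the current task is compared with a reference gradient computed on a sample from episodic memory; if their dot product is negative, the conflicting component is removed. The names `dot` and `agem_project` are hypothetical and chosen for illustration, and gradients are represented as flat Python lists for simplicity.

```python
from typing import List


def dot(a: List[float], b: List[float]) -> float:
    """Inner product of two flat gradient vectors."""
    return sum(x * y for x, y in zip(a, b))


def agem_project(grad: List[float], ref_grad: List[float]) -> List[float]:
    """Standard A-GEM projection (illustrative helper, not the paper's code).

    grad     : gradient on the current task's mini-batch
    ref_grad : gradient on a mini-batch sampled from episodic memory

    If the two gradients do not conflict (dot product >= 0), grad is
    returned unchanged. Otherwise the component of grad along ref_grad
    is subtracted, so the result is orthogonal to ref_grad.
    """
    g_dot_ref = dot(grad, ref_grad)
    if g_dot_ref >= 0.0:
        return list(grad)
    # g_dot_ref < 0 implies ref_grad is non-zero, so the division is safe.
    scale = g_dot_ref / dot(ref_grad, ref_grad)
    return [g - scale * r for g, r in zip(grad, ref_grad)]


# Conflicting case: the update would increase the loss on memory samples.
projected = agem_project([1.0, -2.0], [0.0, 1.0])
# Non-conflicting case: the gradient passes through untouched.
unchanged = agem_project([1.0, 1.0], [0.0, 1.0])
```

A soft-constraint variant would presumably relax or rescale this hard projection using $\epsilon$, but that detail is an assumption here and should be taken from the paper itself.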


Weight Friction: A Simple Method to Overcome Catastrophic Forgetting and Enable Continual Learning

Liu, Gabrielle K.

arXiv.org Machine Learning

In recent years, deep neural networks have found success in replicating human-level cognitive skills, yet they suffer from several major obstacles. One significant limitation is the inability to learn new tasks without forgetting previously learned tasks, a shortcoming known as catastrophic forgetting. In this research, we propose a simple method to overcome catastrophic forgetting and enable continual learning in neural networks. We draw inspiration from principles in neurology and physics to develop the concept of weight friction. Weight friction operates by a modification to the update rule in the gradient descent optimization method. It converges at a rate comparable to that of the stochastic gradient descent algorithm and can operate over multiple task domains. It performs comparably to current methods while offering improvements in computation and memory efficiency.