Hard ASH: Sparsity and the right optimizer make a continual learner
In class-incremental learning, neural networks typically suffer from catastrophic forgetting. We show that an MLP with a sparse activation function and an adaptive learning rate optimizer can compete with established regularization techniques on the Split-MNIST task. We highlight the effectiveness of the Adaptive SwisH (ASH) activation function in this context and introduce a novel variant, Hard Adaptive SwisH (Hard ASH), to further enhance learning retention.

Continual learning presents a unique challenge for artificial neural networks, particularly in the class-incremental setting (Hsu et al., 2019), where a single network must remember old classes that have left the training set. In this paper we explore an overlooked approach that does not require any techniques developed specifically for continual learning.
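The listing does not spell out how the sparse MLP is built, so the sketch below only illustrates the general recipe the abstract describes: a plain MLP whose hidden units are made sparse by an activation that keeps only the strongest responses, trained with an adaptive learning rate optimizer. The top-k activation, layer sizes, keep fraction, and the choice of Adam are assumptions for illustration; the actual ASH and Hard ASH definitions are given in the paper itself.

```python
# Hypothetical sketch only: the exact ASH / Hard ASH formulations are defined in the
# paper, not in this listing. Here sparsity is assumed to mean "keep the top fraction
# of pre-activations per sample, zero the rest", with Adam as the adaptive optimizer.
import torch
import torch.nn as nn


class TopKSparseActivation(nn.Module):
    """Zeroes all but the largest `keep_frac` fraction of units per sample (assumed behaviour)."""

    def __init__(self, keep_frac: float = 0.1):
        super().__init__()
        self.keep_frac = keep_frac

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        k = max(1, int(self.keep_frac * x.shape[-1]))
        # Threshold = k-th largest pre-activation in each sample.
        thresh = x.topk(k, dim=-1).values[..., -1:]
        return torch.where(x >= thresh, x, torch.zeros_like(x))


# A plain MLP for Split-MNIST style inputs (28x28 images, 10 classes overall).
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(784, 400),
    TopKSparseActivation(keep_frac=0.1),  # sparse hidden layer
    nn.Linear(400, 400),
    TopKSparseActivation(keep_frac=0.1),
    nn.Linear(400, 10),
)

# Adaptive learning rate optimizer (assumed; the paper's exact choice may differ).
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
```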
arXiv.org Artificial Intelligence
Apr-26-2024