AITopics | Eeckt, Steven Vander

Eeckt, Steven Vander

Continual Learning With Quasi-Newton Methods

Eeckt, Steven Vander, Van hamme, Hugo

arXiv.org Artificial Intelligence, Mar-25-2025

Received 17 February 2025, accepted 5 March 2025, date of publication 13 March 2025, date of current version 21 March 2025.

Continual Learning with Quasi-Newton Methods. STEVEN VANDER EECKT and HUGO VAN HAMME (Senior, IEEE), Department Electrical Engineering ESAT-PSI, KU Leuven, B-3001 Leuven, Belgium. Corresponding author: Steven Vander Eeckt (e-mail: steven.vandereeckt@esat.kuleuven.be).

ABSTRACT Catastrophic forgetting remains a major challenge when neural networks learn tasks sequentially. Elastic Weight Consolidation (EWC) attempts to address this problem by introducing a Bayesian-inspired regularization loss to preserve knowledge of previously learned tasks. However, EWC relies on a Laplace approximation where the Hessian is simplified to the diagonal of the Fisher information matrix, assuming uncorrelated model parameters. This overly simplistic assumption often leads to poor Hessian estimates, limiting its effectiveness. To overcome this limitation, we introduce Continual Learning with Sampled Quasi-Newton (CSQN), which leverages Quasi-Newton methods to compute more accurate Hessian approximations. Experimental results across four benchmarks demonstrate that CSQN consistently outperforms EWC and other state-of-the-art baselines, including rehearsal-based methods. CSQN reduces EWC's forgetting by 50% and improves its performance by 8% on average. Notably, CSQN achieves superior results on three out of four benchmarks, including the most challenging scenarios, highlighting its potential as a robust solution for continual learning.

INDEX TERMS artificial neural networks, catastrophic forgetting, continual learning, quasi-Newton methods

I. INTRODUCTION Since the 2010s, Artificial Neural Networks (ANNs) have been able to match or even surpass human performance on a wide variety of tasks. However, when presented with a set of tasks to be learned sequentially, a setting referred to as Continual Learning (CL), ANNs suffer from catastrophic forgetting [1]. Unlike humans, ANNs struggle to retain previously learned knowledge when extending their knowledge. Naively adapting an ANN to a new task generally leads to a deterioration in the network's performance on previous tasks.

Many CL methods have been proposed to alleviate catastrophic forgetting. One of the most well-known is Elastic Weight Consolidation (EWC) [2], which approaches CL from a Bayesian perspective. After training on a task, EWC uses the Laplace approximation [3] to estimate a posterior distribution over the model parameters for that task. When training on the next task, this posterior is used via a regularization loss to prevent the model from catastrophically forgetting the previous task. To estimate the Hessian, which is needed in the Laplace approximation to measure the (un)certainty of the model parameters, EWC uses the Fisher Information Matrix (FIM). Furthermore, to simplify the computation, EWC assumes that the FIM is approximately diagonal.
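The diagonal-FIM regularizer described above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the `strength` parameter, and the use of plain Python lists for parameters are assumptions made for illustration. The diagonal is estimated here as the mean of squared per-sample gradients (an empirical Fisher estimate), and the penalty is the usual quadratic form (strength / 2) * sum_i F_i * (theta_i - theta*_i)^2.

```python
def diagonal_fisher(per_sample_grads):
    """Estimate the FIM diagonal as the mean of squared per-sample gradients.

    per_sample_grads: list of gradient vectors (lists of floats), one per sample.
    This is an illustrative empirical estimate, not the paper's exact procedure.
    """
    if not per_sample_grads:
        raise ValueError("need at least one gradient vector")
    n_params = len(per_sample_grads[0])
    fisher = [0.0] * n_params
    for grad in per_sample_grads:
        for i, g in enumerate(grad):
            fisher[i] += g * g
    return [f / len(per_sample_grads) for f in fisher]


def ewc_penalty(params, anchor_params, fisher_diag, strength=1.0):
    """Quadratic penalty that discourages moving away from the old-task parameters.

    params:        current parameters while training on the new task
    anchor_params: parameters found after training on the previous task
    fisher_diag:   diagonal FIM entries, one importance weight per parameter
    strength:      hypothetical regularization weight (often written as lambda)
    """
    if not (len(params) == len(anchor_params) == len(fisher_diag)):
        raise ValueError("all inputs must have the same length")
    total = 0.0
    for theta, theta_star, f in zip(params, anchor_params, fisher_diag):
        total += f * (theta - theta_star) ** 2
    return 0.5 * strength * total
```

In training, this penalty would be added to the new task's loss. Because only one weight per parameter is kept, correlations between parameters are ignored, which is the limitation the abstract attributes to EWC.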

artificial intelligence, machine learning, survey article, (16 more...)


doi: 10.1109/ACCESS.2025.3551146

arXiv: 2503.19939

Country:

North America > United States (0.68)
Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.44)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry: Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)


Weight Averaging: A Simple Yet Effective Method to Overcome Catastrophic Forgetting in Automatic Speech Recognition

Eeckt, Steven Vander, Van hamme, Hugo

arXiv.org Artificial Intelligence, Mar-9-2023

Adapting a trained Automatic Speech Recognition (ASR) model to new tasks results in catastrophic forgetting of old tasks, limiting the model's ability to learn continually and to be extended to new speakers, dialects, languages, etc. Focusing on End-to-End ASR, in this paper, we propose a simple yet effective method to overcome catastrophic forgetting: weight averaging. By simply taking the average of the previous and the adapted model, our method achieves high performance on both the old and new tasks. It can be further improved by introducing a knowledge distillation loss during the adaptation. We illustrate the effectiveness of our method on both monolingual and multilingual ASR. In both cases, our method strongly outperforms all baselines, even in its simplest form.
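As a rough illustration of the weight-averaging idea in this abstract, the sketch below averages two sets of model parameters element by element. It is an assumption-laden toy, not the paper's implementation: the function name, the dict-of-lists parameter layout, and the `new_weight` mixing factor are hypothetical, with `new_weight=0.5` corresponding to the plain average the abstract describes.

```python
def average_weights(old_model, adapted_model, new_weight=0.5):
    """Interpolate two models parameter by parameter.

    old_model, adapted_model: dicts mapping a parameter name to a list of floats.
    new_weight: hypothetical mixing factor; 0.5 gives the simple average.
    """
    if old_model.keys() != adapted_model.keys():
        raise ValueError("models must have the same parameter names")
    averaged = {}
    for name, old_values in old_model.items():
        new_values = adapted_model[name]
        if len(old_values) != len(new_values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        averaged[name] = [
            (1.0 - new_weight) * o + new_weight * n
            for o, n in zip(old_values, new_values)
        ]
    return averaged
```

For example, `average_weights({"w": [0.0, 2.0]}, {"w": [2.0, 4.0]})` yields a model whose `w` lies halfway between the old and adapted values.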

experiment, machine learning, natural language, (18 more...)


arXiv: 2210.15282

Country: Europe > Belgium (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)


Continual Learning for Monolingual End-to-End Automatic Speech Recognition

Eeckt, Steven Vander, Van hamme, Hugo

arXiv.org Machine Learning, Dec-17-2021

Adapting Automatic Speech Recognition (ASR) models to new domains leads to a deterioration of performance on the original domain(s), a phenomenon called Catastrophic Forgetting (CF). Even monolingual ASR models cannot be extended to new accents, dialects, topics, etc. without suffering from CF, making them unable to be continually enhanced without storing all past data. Fortunately, Continual Learning (CL) methods, which aim to enable continual adaptation while overcoming CF, can be used. In this paper, we implement an extensive number of CL methods for End-to-End ASR and test and compare their ability to extend a monolingual Hybrid CTC-Transformer model across four new tasks. We find that the best performing CL method closes the gap between the fine-tuned model (lower bound) and the model trained jointly on all tasks (upper bound) by more than 40%, while requiring access to only 0.6% of the original data.
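The "closes the gap" figure in this abstract can be read as a normalized score between the fine-tuned lower bound and the jointly trained upper bound. The sketch below shows one plausible way to compute such a score. It assumes an error metric where lower is better (such as word error rate), which the abstract does not state explicitly; the function name and the numbers in the usage note are hypothetical.

```python
def gap_closed(finetuned_error, method_error, joint_error):
    """Fraction of the lower-bound/upper-bound gap closed by a CL method.

    finetuned_error: error of naive fine-tuning (lower bound on performance)
    method_error:    error of the continual learning method being evaluated
    joint_error:     error of the model trained jointly on all tasks (upper bound)
    Assumes lower error is better; returns 0.0 at the lower bound, 1.0 at the upper.
    """
    gap = finetuned_error - joint_error
    if gap == 0:
        raise ValueError("lower and upper bound coincide; the gap is undefined")
    return (finetuned_error - method_error) / gap
```

With made-up errors of 20.0 (fine-tuned), 16.0 (CL method), and 10.0 (joint), the method recovers 4 of the 10 points separating the two bounds, so 40% of the gap is closed.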

artificial intelligence, machine learning, natural language, (16 more...)


arXiv: 2112.09427

Country: Europe > Belgium (0.29)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
