oe 1
The Joint Effect of Task Similarity and Overparameterization on Catastrophic Forgetting -- An Analytical Model
Goldfarb, Daniel, Evron, Itay, Weinberger, Nir, Soudry, Daniel, Hand, Paul
In continual learning, catastrophic forgetting is affected by multiple aspects of the tasks. Previous works have analyzed separately how forgetting is affected by either task similarity or overparameterization. In contrast, our paper examines how task similarity and overparameterization jointly affect forgetting in an analyzable model. Specifically, we focus on two-task continual linear regression, where the second task is a random orthogonal transformation of an arbitrary first task (an abstraction of random permutation tasks). We derive an exact analytical expression for the expected forgetting - and uncover a nuanced pattern. In highly overparameterized models, intermediate task similarity causes the most forgetting. However, near the interpolation threshold, forgetting decreases monotonically with the expected task similarity. We validate our findings with linear regression on synthetic data, and with neural networks on established permutation task benchmarks.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
- Asia > Middle East > Israel (0.04)
Unifying Model-Based and Neural Network Feedforward: Physics-Guided Neural Networks with Linear Autoregressive Dynamics
Kon, Johan, Bruijnen, Dennis, van de Wijdeven, Jeroen, Heertjes, Marcel, Oomen, Tom
Unknown nonlinear dynamics often limit the tracking performance of feedforward control. The aim of this paper is to develop a feedforward control framework that can compensate these unknown nonlinear dynamics using universal function approximators. The feedforward controller is parametrized as a parallel combination of a physics-based model and a neural network, where both share the same linear autoregressive (AR) dynamics. This parametrization allows for efficient output-error optimization through Sanathanan-Koerner (SK) iterations. Within each SK-iteration, the output of the neural network is penalized in the subspace of the physics-based model through orthogonal projection-based regularization, such that the neural network captures only the unmodelled dynamics, resulting in interpretable models.
- Europe > Netherlands > North Brabant > Eindhoven (0.04)
- Europe > Netherlands > South Holland > Delft (0.04)
- North America > Mexico > Quintana Roo > Cancún (0.04)