On Understanding of the Dynamics of Model Capacity in Continual Learning
Chakraborty, Supriyo, Raghavan, Krishnan
arXiv.org Artificial Intelligence
The stability-plasticity dilemma, closely related to a neural network's (NN) capacity (its ability to represent tasks), is a fundamental challenge in continual learning (CL). Within this context, we introduce CL's effective model capacity (CLEMC), which characterizes the dynamic behavior of the stability-plasticity balance point. We develop a difference equation to model the evolution of the interplay between the NN, task data, and optimization procedure. We then leverage CLEMC to demonstrate that the effective capacity (and, by extension, the stability-plasticity balance point) is inherently non-stationary. We show that regardless of the NN architecture or optimization method, an NN's ability to represent new tasks diminishes when incoming task distributions differ from previous ones. We conduct extensive experiments to support our theoretical findings, spanning a range of architectures, from small feedforward and convolutional networks to medium-sized graph neural networks and transformer-based large language models with millions of parameters.
Aug-15-2025
- Country:
- North America > United States (0.46)
- Genre:
- Research Report > New Finding (0.67)
- Industry:
- Energy (0.87)
- Government > Regional Government (0.46)
- Technology: