Theory on Forgetting and Generalization of Continual Learning
Sen Lin, Peizhong Ju, Yingbin Liang, Ness Shroff
arXiv.org Artificial Intelligence
Continual learning (CL) [41] is a learning paradigm in which an agent must continuously learn a sequence of tasks. To emulate the lifelong learning capability of human beings, the agent is expected to learn new tasks more easily by building on accumulated knowledge from old tasks, and to further improve its performance on old tasks by leveraging the knowledge of new tasks. The former is referred to as forward knowledge transfer and the latter as backward knowledge transfer. A major challenge here is so-called catastrophic forgetting [36], i.e., the agent easily forgets the knowledge of old tasks when learning new ones. Although there have been significant experimental efforts (e.g., [27, 14, 50, 16, 17]) to address the forgetting issue, the theoretical understanding of CL is still at an early stage, with only a few recent attempts, e.g., [49, 12, 16, 17] (see Section 2 for a more detailed discussion of previous theoretical studies of CL). However, none of these existing theoretical results provide an explicit characterization of forgetting and generalization error that depends only on fundamental system parameters and setups (e.g., the number of tasks/samples/parameters, noise level, task similarity, and task order). Our work here thus provides the first-known explicit theoretical result, which enables a comprehensive understanding of which factors are relevant and how they (precisely) affect the forgetting and generalization error of CL. Our main contributions can be summarized as follows. First, we provide theoretical results on the expected value of forgetting and overall generalization error in CL, under a linear regression setup with i.i.d.
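To make the quantities discussed above concrete, the following is a minimal numerical sketch of how forgetting and overall generalization error can be measured in a sequential linear regression setting. It is an illustration under assumed parameters (the dimensions `T`, `n`, `d`, noise level, and the shared-component construction of task similarity are all choices made here, not the paper's exact formulation), and it uses the minimum-distance interpolating update commonly associated with (S)GD on overparameterized linear models rather than any specific algorithm from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed setup: T linear regression tasks, each with n samples of dimension d
# (overparameterized: d > n), i.i.d. Gaussian features, and noisy labels.
T, n, d, noise_std = 5, 30, 100, 0.1

# Ground-truth weights: a shared component plus a task-specific one, so that
# "task similarity" can be tuned via the mixing coefficient alpha (an assumption).
alpha = 0.8
shared = rng.standard_normal(d)
true_w = [alpha * shared + (1 - alpha) * rng.standard_normal(d) for _ in range(T)]

def make_task(w):
    X = rng.standard_normal((n, d))
    y = X @ w + noise_std * rng.standard_normal(n)
    return X, y

tasks = [make_task(w) for w in true_w]

def test_error(w, w_true, n_test=2000):
    # Population-style test error estimated on fresh Gaussian features.
    X = rng.standard_normal((n_test, d))
    return np.mean((X @ w - X @ w_true) ** 2)

# Continual learning rule (illustrative): for each new task, move to the
# interpolating solution closest in Euclidean distance to the current weights.
w = np.zeros(d)
err_right_after = []  # test error on task t immediately after learning it
for t, (X, y) in enumerate(tasks):
    w = w + np.linalg.pinv(X) @ (y - X @ w)  # minimum-distance interpolator
    err_right_after.append(test_error(w, true_w[t]))

# Forgetting: average increase in error on old tasks after training on all tasks.
final_errors = [test_error(w, w_true) for w_true in true_w]
forgetting = np.mean([final_errors[t] - err_right_after[t] for t in range(T - 1)])
generalization = np.mean(final_errors)

print(f"forgetting     = {forgetting:.4f}")
print(f"generalization = {generalization:.4f}")
```

Varying `alpha` in this sketch changes how similar the tasks are, which in turn changes how much is forgotten, mirroring the role that task similarity and task order play in the kind of explicit characterization described above.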
arXiv.org Artificial Intelligence
Feb-11-2023