Continual Learning Beyond a Single Model
Doan, Thang, Mirzadeh, Seyed Iman, Farajtabar, Mehrdad
arXiv.org Artificial Intelligence
A growing body of research in continual learning focuses on the catastrophic forgetting problem. While many attempts have been made to alleviate this problem, the majority of methods assume a single model in the continual learning setup. In this work, we question this assumption and show that employing ensemble models can be a simple yet effective method to improve continual performance. However, the training and inference costs of ensembles can increase significantly as the number of models grows. Motivated by this limitation, we study different ensemble models to understand their benefits and drawbacks in continual learning scenarios. Finally, to overcome the high compute cost of ensembles, we leverage recent advances in neural network subspaces to propose a computationally cheap algorithm with a runtime similar to that of a single model that still enjoys the performance benefits of ensembles.

Continual learning (CL) and lifelong learning (Thrun, 1994) have recently gained popularity, since many real-world applications fall into this setting. It describes the scenario in which a stream of data not only arrives sequentially but also changes in distribution over time. This setup induces catastrophic forgetting (CF) (McCloskey & Cohen, 1989), a degradation of performance on previous data caused by the distribution shift between tasks (Doan et al., 2021). One fundamental goal in continual learning is to learn from new incoming tasks while retaining knowledge from the past and avoiding interference that can lead to poor performance (Lesort et al., 2021). This becomes particularly challenging as the stream of data grows, because the entire burden falls on a single model. A simple yet effective solution is to rely on an ensemble method, which improves performance over a single model. Inspired by bootstrapping (Breiman, 1996), deep ensembles initialize and train multiple neural networks independently (Lakshminarayanan et al., 2017; Fort et al., 2019).
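The deep-ensemble idea described above, several networks initialized and trained independently and then combined at prediction time, can be illustrated with a minimal sketch. This is not the authors' algorithm or their experimental setup: the toy two-class dataset, the one-hidden-layer network, the hyperparameters, and the helper names (`init_mlp`, `forward`, `train`, `ensemble_predict`) are all assumptions made for illustration, and probability averaging is only one common way to combine members.

```python
import numpy as np

def init_mlp(rng, n_in, n_hidden, n_out):
    """Randomly initialize a one-hidden-layer MLP (hypothetical helper)."""
    return {
        "W1": rng.normal(0.0, 1.0 / np.sqrt(n_in), (n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 1.0 / np.sqrt(n_hidden), (n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }

def forward(params, x):
    """Return hidden activations and softmax class probabilities."""
    h = np.tanh(x @ params["W1"] + params["b1"])
    logits = h @ params["W2"] + params["b2"]
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return h, p / p.sum(axis=1, keepdims=True)

def train(params, x, y, lr=0.1, steps=300):
    """Full-batch gradient descent on the cross-entropy loss (updates in place)."""
    n = x.shape[0]
    onehot = np.eye(params["b2"].shape[0])[y]
    for _ in range(steps):
        h, p = forward(params, x)
        d_logits = (p - onehot) / n                       # gradient w.r.t. logits
        d_h = (d_logits @ params["W2"].T) * (1.0 - h ** 2)  # backprop through tanh
        params["W2"] -= lr * (h.T @ d_logits)
        params["b2"] -= lr * d_logits.sum(axis=0)
        params["W1"] -= lr * (x.T @ d_h)
        params["b1"] -= lr * d_h.sum(axis=0)
    return params

def ensemble_predict(members, x):
    """Average the class probabilities predicted by all ensemble members."""
    return np.mean([forward(m, x)[1] for m in members], axis=0)

# Assumed toy data: two well-separated Gaussian blobs, one per class.
data_rng = np.random.default_rng(0)
x = np.vstack([
    data_rng.normal(-2.0, 1.0, (100, 2)),
    data_rng.normal(2.0, 1.0, (100, 2)),
])
y = np.array([0] * 100 + [1] * 100)

# Each member gets its own seed, so it starts from a different initialization
# and is trained without any interaction with the other members.
members = [
    train(init_mlp(np.random.default_rng(seed), 2, 16, 2), x, y)
    for seed in (1, 2, 3)
]

probs = ensemble_predict(members, x)
accuracy = float((probs.argmax(axis=1) == y).mean())
print(f"ensemble training accuracy: {accuracy:.3f}")
```

The sketch also makes the cost concern visible: training and inference both loop over every member, so compute grows linearly with ensemble size, which is the limitation the abstract cites as motivation for a cheaper alternative.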
Jul-3-2023