Carousel Memory: Rethinking the Design of Episodic Memory for Continual Learning
Soobee Lee, Minindu Weerakoon, Jonghyun Choi, Minjia Zhang, Di Wang, Myeongjae Jeon
Continual Learning (CL) is an emerging machine learning paradigm that aims to learn from a continuous stream of tasks without forgetting the knowledge learned from previous tasks. To avoid the performance degradation caused by forgetting, prior studies exploit episodic memory (EM), which stores a subset of past observed samples for replay while learning from new non-i.i.d. data. Despite the promising results, since CL is often assumed to execute on mobile or IoT devices, the EM size is bounded by the small hardware memory capacity, which makes it infeasible to meet the accuracy requirements of real-world applications. Specifically, all prior CL methods discard samples that overflow from the EM and can never retrieve them for subsequent training steps, incurring a loss of information that exacerbates catastrophic forgetting. We explore a novel hierarchical EM management strategy to address this forgetting issue. In particular, on mobile and IoT devices, real-time data can be stored not only in high-speed RAM but also on internal storage devices, which offer significantly larger capacity than RAM. Based on this insight, we propose to exploit the abundant storage to preserve past experiences and alleviate forgetting by allowing CL to efficiently migrate samples between memory and storage without being hindered by the slow access speed of the storage. We call this approach Carousel Memory (CarM). As CarM is complementary to existing CL methods, we conduct extensive evaluations of our approach with seven popular CL methods and show that CarM significantly improves their accuracy across different settings, by large margins in final average accuracy (up to 28.4%), while retaining the same training efficiency.

With the rising demand for realistic on-device machine learning, recent years have witnessed a novel learning paradigm, namely continual learning (CL), for training neural networks (NN) with a stream of non-i.i.d. data. In such a paradigm, the neural network is incrementally trained as new tasks (e.g., sets of classes) arrive (Rebuffi et al., 2017). The NN model is expected to continuously learn new knowledge from new tasks over time while retaining previously learned knowledge, which is a closer representation of how intelligent systems operate in the real world. In this learning setup, knowledge should be acquired not only from new data in a timely fashion but also in a computationally efficient manner. In this regard, CL is well suited to learning on mobile and IoT devices (Hayes et al., 2020; Wang et al., 2019).
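The following is a minimal, illustrative sketch (in Python) of the two-tier episodic-memory idea described above: a small fixed-size RAM buffer backed by local storage, where samples evicted from RAM are spilled to storage rather than discarded and are periodically swapped back in so that replay eventually covers the full history. The class and method names (TwoTierEM, insert, swap, sample_batch) and the random eviction/swap policy are assumptions made for illustration only; this is not the CarM implementation.

```python
# Illustrative sketch only: a hierarchical (RAM + storage) episodic memory.
# Evicted samples are persisted to disk instead of being discarded, and a
# small fraction of RAM samples is periodically exchanged with storage.
import os
import pickle
import random
import threading
import uuid


class TwoTierEM:
    def __init__(self, ram_capacity, storage_dir, swap_fraction=0.1):
        self.ram_capacity = ram_capacity
        self.swap_fraction = swap_fraction
        self.storage_dir = storage_dir
        os.makedirs(storage_dir, exist_ok=True)
        self.ram = []            # (x, y) pairs kept in fast memory
        self.storage_index = []  # file paths of samples spilled to storage
        self._lock = threading.Lock()

    def insert(self, sample):
        """Add a new sample; spill a random RAM sample to storage if full."""
        with self._lock:
            if len(self.ram) < self.ram_capacity:
                self.ram.append(sample)
            else:
                victim_idx = random.randrange(len(self.ram))
                self._spill(self.ram[victim_idx])
                self.ram[victim_idx] = sample

    def _spill(self, sample):
        """Persist an evicted sample to the storage tier instead of dropping it."""
        path = os.path.join(self.storage_dir, f"{uuid.uuid4().hex}.pkl")
        with open(path, "wb") as f:
            pickle.dump(sample, f)
        self.storage_index.append(path)

    def swap(self):
        """Exchange a small fraction of RAM samples with randomly chosen
        storage samples so replay gradually covers the whole history."""
        with self._lock:
            if not self.storage_index or not self.ram:
                return
            n = max(1, int(self.swap_fraction * len(self.ram)))
            for _ in range(n):
                path = random.choice(self.storage_index)
                with open(path, "rb") as f:
                    incoming = pickle.load(f)
                victim_idx = random.randrange(len(self.ram))
                outgoing = self.ram[victim_idx]
                self.ram[victim_idx] = incoming
                with open(path, "wb") as f:  # write the displaced sample back
                    pickle.dump(outgoing, f)

    def sample_batch(self, k):
        """Draw a replay mini-batch from the fast in-RAM tier only."""
        with self._lock:
            return random.sample(self.ram, min(k, len(self.ram)))
```

In an actual training loop, swap() would typically run on a background thread so that slow storage I/O does not stall gradient updates, mirroring the abstract's point about migrating samples without being hindered by storage access latency.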
arXiv.org Artificial Intelligence
Oct-14-2021