Reviews: Gradient Episodic Memory for Continual Learning
–Neural Information Processing Systems
The authors of the manuscript consider the continuum learning setting, where the learner observes a stream of data points from training, which are ordered according to the tasks they belong to, i.e. the learner encounters any data from the next task only after it has observed all the training data for the current one. The authors propose a set of three metrics for evaluation performance of learning algorithms in this setting, which reflect their ability to transfer information to new tasks and not forget information about the earlier tasks. Could the authors, please, comment on the difference between continuum and lifelong learning (the corresponding sentence in line 254 seems incomplete)? The authors also propose a learning method, termed Gradient of Episodic Memory (GEM). The idea of the method is to keep a set of examples from every observed task and make sure that at each update stage the loss on the observed tasks does not increase.
Neural Information Processing Systems
Oct-8-2024, 13:29:24 GMT