Meta-Consolidation for Continual Learning
The ability to continuously learn and adapt to new tasks, without losing grasp of already acquired knowledge, is a hallmark of biological learning systems that current deep learning systems fall short of. In this work, we present a novel methodology for continual learning called MERLIN: Meta-Consolidation for Continual Learning. We assume that the weights of a neural network for solving a task come from a meta-distribution, which is learned and consolidated incrementally. We operate in the challenging online continual learning setting, where a data point is seen by the model only once. Our experiments on the continual learning benchmarks MNIST, CIFAR-10, CIFAR-100 and Mini-ImageNet show consistent improvement over five baselines, including a recent state-of-the-art method, corroborating the promise of MERLIN.
Review for NeurIPS paper: Meta-Consolidation for Continual Learning
Summary and Contributions: The paper proposes an online continual learning method, MERLIN, that learns a distribution over task-specific model parameters given a context (task identifiers, etc.). A VAE is used to model the distribution over the model parameters. More specifically, given the dataset of a task t, the idea is to train B separate models. A VAE is then trained, using these B sets of model parameters as training points, to learn an encoder (mapping the parameters to the latent space) and a decoder (mapping the latent back to model parameters). The standard VAE ELBO is maximized during training.
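To make the mechanism described above concrete, here is a minimal PyTorch sketch of the core step: treating the B flattened parameter vectors of the task-specific models as data points and fitting a VAE to them by maximizing the standard ELBO. All names (ParamVAE, param_dim, latent_dim, layer sizes) and the use of an MSE reconstruction term are illustrative assumptions, not details confirmed by the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ParamVAE(nn.Module):
    """VAE whose "data points" are flattened parameter vectors of task models."""
    def __init__(self, param_dim, latent_dim=64):
        super().__init__()
        self.enc = nn.Linear(param_dim, 256)
        self.mu = nn.Linear(256, latent_dim)
        self.logvar = nn.Linear(256, latent_dim)
        self.dec = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, param_dim),
        )

    def forward(self, w):
        h = F.relu(self.enc(w))
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick: sample z ~ q(z|w).
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        return self.dec(z), mu, logvar

def neg_elbo(w, w_recon, mu, logvar):
    # Negative ELBO: reconstruction term + KL(q(z|w) || N(0, I)).
    recon = F.mse_loss(w_recon, w, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl

# B flattened parameter vectors from B separately trained task models
# (random stand-ins here, purely for illustration).
B, param_dim = 8, 1000
weights = torch.randn(B, param_dim)

vae = ParamVAE(param_dim)
opt = torch.optim.Adam(vae.parameters(), lr=1e-3)
for _ in range(100):
    w_recon, mu, logvar = vae(weights)
    loss = neg_elbo(weights, w_recon, mu, logvar)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

After training, sampling z from the prior and decoding it yields new parameter vectors, which is how a distribution over model weights (rather than a single point estimate) can be maintained and consolidated across tasks.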