Review for NeurIPS paper: Meta-Consolidation for Continual Learning

Neural Information Processing Systems 

Summary and Contributions: The paper proposes MERLIN, an online continual learning method that learns a distribution over task-specific model parameters given a context (task identifiers, etc.). A VAE is used to model this distribution. More specifically, given the dataset of a task t, the idea is to train 'B' separate models. A VAE is then trained with these 'B' sets of model parameters as training points, learning an encoder (mapping the parameters to the latent space) and a decoder (mapping the latent back to model parameters). The standard VAE ELBO is maximized during training.
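
To make the summarized objective concrete, here is a minimal sketch of a Monte Carlo ELBO estimate in which each training point is the flattened parameter vector of one of the 'B' models. It is an illustration, not the authors' implementation: the linear encoder and decoder, the standard normal prior, the unit-variance Gaussian likelihood, the omission of the task context, and all names (`elbo_estimate`, `W_enc`, `W_dec`, `thetas`) are assumptions made for the example, and random vectors stand in for actual trained weights.

```python
import numpy as np

def elbo_estimate(theta, mu_z, logvar_z, decode, rng, n_samples=8):
    """Monte Carlo ELBO for one flattened parameter vector `theta`.

    Assumptions (not from the paper): q(z|theta) = N(mu_z, diag(exp(logvar_z))),
    prior p(z) = N(0, I), and a unit-variance Gaussian decoder p(theta|z).
    """
    std = np.exp(0.5 * logvar_z)
    eps = rng.standard_normal((n_samples, mu_z.size))
    z = mu_z + std * eps                      # reparameterized latent samples
    recon = np.stack([decode(zi) for zi in z])
    d = theta.size
    # log N(theta; recon, I), one value per latent sample
    log_lik = -0.5 * np.sum((recon - theta) ** 2, axis=1) - 0.5 * d * np.log(2 * np.pi)
    # closed-form KL( q(z|theta) || N(0, I) )
    kl = 0.5 * np.sum(np.exp(logvar_z) + mu_z ** 2 - 1.0 - logvar_z)
    return log_lik.mean() - kl

rng = np.random.default_rng(0)
B, d, k = 5, 10, 2                            # models per task, parameter dim, latent dim
thetas = rng.standard_normal((B, d))          # stand-ins for B trained models' weights

# Hypothetical linear encoder/decoder, used only to keep the sketch short.
W_enc = 0.1 * rng.standard_normal((k, d))
W_dec = 0.1 * rng.standard_normal((d, k))
encode = lambda th: (W_enc @ th, np.zeros(k))  # returns (mean, log-variance)
decode = lambda z: W_dec @ z

# Average ELBO over the B parameter vectors; training would maximize this.
batch_elbo = np.mean([elbo_estimate(th, *encode(th), decode, rng) for th in thetas])
```

In an actual training loop the encoder and decoder would be neural networks updated by gradient ascent on this quantity, and both would additionally receive the context as input.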