





Neural Information Processing Systems

In a nutshell, the features of this dataset are sampled randomly from N(0, 1), and the target is produced by an ensemble of randomly constructed decision trees applied to the sampled features. Our dataset has 10,000 objects and 8 features, and the target was produced by 16 decision trees of depth 6. CatBoost is trained with the default hyperparameters. Importantly, the latter means that this approach is not covered by the embedding framework described in subsection 3.1. So, it seems to be important to embed each feature separately, as described in subsection 3.1.
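The synthetic dataset described above can be sketched as follows. The source does not specify how the random trees are constructed or how the ensemble is aggregated, so random split features/thresholds, random leaf values, and summation over trees are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_random_tree(n_features, depth, rng):
    # Recursively build a random decision tree: internal nodes split on a
    # random feature at a random threshold; leaves hold random values.
    # (Split/leaf randomization is an assumption, not from the source.)
    if depth == 0:
        return ("leaf", rng.normal())
    feat = int(rng.integers(n_features))
    thresh = rng.normal()
    left = make_random_tree(n_features, depth - 1, rng)
    right = make_random_tree(n_features, depth - 1, rng)
    return ("split", feat, thresh, left, right)

def predict_tree(tree, x):
    # Route a single object x down the tree to a leaf value.
    if tree[0] == "leaf":
        return tree[1]
    _, feat, thresh, left, right = tree
    return predict_tree(left if x[feat] <= thresh else right, x)

# Sizes from the text: 10,000 objects, 8 features, 16 trees of depth 6.
n_objects, n_features, n_trees, depth = 10_000, 8, 16, 6
X = rng.standard_normal((n_objects, n_features))  # features ~ N(0, 1)
trees = [make_random_tree(n_features, depth, rng) for _ in range(n_trees)]
# Target: ensemble (here, the sum) of the random trees applied per object.
y = np.array([sum(predict_tree(t, x) for t in trees) for x in X])
```

The resulting `(X, y)` pair could then be fed to CatBoost with default hyperparameters, as the text describes.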







Posterior Meta-Replay for Continual Learning

Neural Information Processing Systems

In principle, Bayesian learning directly applies to this setting, since recursive and one-off Bayesian updates yield the same result. In practice, however, recursive updating often leads to poor trade-off solutions across tasks because approximate inference is necessary for most models of interest. Here, we describe an alternative Bayesian approach where task-conditioned parameter distributions are continually inferred from data. We offer a practical deep learning implementation of our framework based on probabilistic task-conditioned hypernetworks, an approach we term posterior meta-replay. Experiments on standard benchmarks show that our probabilistic hypernetworks compress sequences of posterior parameter distributions with virtually no forgetting. We obtain considerable performance gains compared to existing Bayesian CL methods, and identify task inference as our major limiting factor. This limitation has several causes that are independent of the considered sequential setting, opening up new avenues for progress in CL.
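A task-conditioned hypernetwork of the kind the abstract describes can be sketched minimally: a learned per-task embedding is mapped to the mean and log-variance of a Gaussian posterior over the target network's parameters, from which parameters are sampled via the reparameterisation trick. All dimensions, the linear hypernetwork form, and the Gaussian posterior family here are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions): a 4-d task embedding conditions a
# posterior over a 10-parameter target network.
emb_dim, n_params = 4, 10

# Hypernetwork weights mapping a task embedding to the posterior statistics.
W_mu = rng.standard_normal((n_params, emb_dim)) * 0.1
W_logvar = rng.standard_normal((n_params, emb_dim)) * 0.1

def task_posterior(task_emb):
    # Task-conditioned Gaussian posterior: mean and log-variance of the
    # target network's parameters, produced by the (linear) hypernetwork.
    mu = W_mu @ task_emb
    logvar = W_logvar @ task_emb
    return mu, logvar

def sample_params(task_emb, rng):
    # Reparameterised sample theta = mu + sigma * eps from the posterior.
    mu, logvar = task_posterior(task_emb)
    return mu + np.exp(0.5 * logvar) * rng.standard_normal(n_params)

# One embedding per task; the hypernetwork compresses all task posteriors
# into shared weights, so adding a task adds only a small embedding.
task_embs = [rng.standard_normal(emb_dim) for _ in range(3)]
params = sample_params(task_embs[0], rng)
```

The design point this illustrates is that the sequence of posteriors is stored implicitly in the shared hypernetwork weights rather than as separate per-task models, which is what allows replay of any task's posterior on demand.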