Compact Memory for Continual Logistic Regression
Jung, Yohan; Lee, Hyungi; Chen, Wenlong; Möllenhoff, Thomas; Li, Yingzhen; Lee, Juho; Khan, Mohammad Emtiyaz
Despite recent progress, continual learning still does not match the performance of batch training. To avoid catastrophic forgetting, we need to build a compact memory of essential past knowledge, but no clear solution has yet emerged, even for shallow neural networks with just one or two layers. In this paper, we present a new method to build compact memory for logistic regression. Our method is based on a result by Khan and Swaroop [2021], who show the existence of optimal memory for such models. We formulate the search for the optimal memory as Hessian matching and propose a probabilistic PCA method to estimate it. Our approach can drastically improve accuracy compared to Experience Replay. For instance, on Split-ImageNet, we get 60% accuracy compared to the 30% obtained by replay with a memory size equivalent to 0.3% of the data size. Increasing the memory size to 2% further boosts the accuracy to 74%, closing the gap to the batch accuracy of 77.6% on this task. Our work opens a new direction for building compact memory that can also be useful in the future for continual deep learning.
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Government (0.46)
- Information Technology (0.46)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- Asia > Middle East > Jordan (0.04)
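The Hessian-matching idea in the abstract above can be illustrated on a toy problem. The sketch below is not the authors' method: it trains a logistic-regression model with plain gradient descent and then uses a direct eigendecomposition (a deterministic simplification of the paper's probabilistic-PCA estimator) to build m memory vectors whose outer-product sum approximates the loss Hessian at the trained weights; all sizes and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy past task for a single logistic-regression layer.
n, d = 500, 10
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = (X @ w_true + 0.5 * rng.normal(size=n) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Stand-in for the batch solution: plain gradient descent on the mean loss.
w_star = np.zeros(d)
for _ in range(2000):
    w_star -= 0.5 * (X.T @ (sigmoid(X @ w_star) - y) / n)

# Hessian of the logistic loss at w_star: X^T diag(lam) X,
# with lam_i = sigma_i * (1 - sigma_i).
p = sigmoid(X @ w_star)
lam = p * (1.0 - p)
H = (X * lam[:, None]).T @ X

# Hessian matching: pick m << n memory vectors U (rows) with U^T U ~ H.
# Here: top-m eigenpairs of H scaled by sqrt of their eigenvalues.
m = 4
evals, evecs = np.linalg.eigh(H)                    # ascending eigenvalues
memory = (evecs[:, -m:] * np.sqrt(evals[-m:])).T    # shape (m, d)

rel_err = np.linalg.norm(H - memory.T @ memory) / np.linalg.norm(H)
print(f"relative Hessian-matching error with m={m}: {rel_err:.3f}")
```

By construction this is the best rank-m Frobenius approximation of H, so the m synthetic points preserve the curvature directions that matter most for the past task.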
Knowledge-Adaptation Priors
Khan, Mohammad Emtiyaz; Swaroop, Siddharth
Humans and animals have a natural ability to quickly adapt to their surroundings, but machine-learning models, when subjected to changes, often require complete retraining from scratch. We present Knowledge-adaptation priors (K-priors) to reduce the cost of retraining by enabling quick and accurate adaptation for a wide variety of tasks and models. This is made possible by a combination of weight and function-space priors that reconstruct the gradients of the past, which recovers and generalizes many existing, but seemingly unrelated, adaptation strategies. Training with simple first-order gradient methods can often recover the exact retrained model to arbitrary accuracy by choosing a sufficiently large memory of the past data. Empirical results confirm that the adaptation can be cheap and accurate, making it a promising alternative to retraining.
- North America > United States (0.47)
- Asia (0.46)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
- Education (0.68)
- Information Technology (0.46)
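The weight- and function-space construction described in the K-priors abstract can be sketched for L2-regularized logistic regression. In the toy below (all data, names, and hyperparameters are illustrative, and the memory is taken to be all past inputs rather than a compact subset), the adaptation gradient combines the new-data loss, a function-space term that pulls predictions on the memory toward the old model's soft labels, and a weight-space term anchored at the old weights; with full memory this gradient coincides with the full-batch gradient, so gradient descent recovers the retrained model.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gd(grad, w0, lr=0.005, steps=20000):
    # Plain gradient descent; enough iterations to converge on this toy problem.
    w = w0.copy()
    for _ in range(steps):
        w = w - lr * grad(w)
    return w

d = 5
X_old = rng.normal(size=(400, d))          # past task
X_new = rng.normal(size=(100, d)) + 0.5    # shifted new task
w_gen = rng.normal(size=d)
y_old = (rng.random(400) < sigmoid(X_old @ w_gen)).astype(float)
y_new = (rng.random(100) < sigmoid(X_new @ w_gen)).astype(float)
delta = 1.0  # strength of the L2 regularizer

# Old model, trained on past data only.
g_old = lambda w: X_old.T @ (sigmoid(X_old @ w) - y_old) + delta * w
w_star = gd(g_old, np.zeros(d))

# Reference: full retraining on old + new data.
Xb = np.vstack([X_old, X_new])
yb = np.concatenate([y_old, y_new])
g_batch = lambda w: Xb.T @ (sigmoid(Xb @ w) - yb) + delta * w
w_batch = gd(g_batch, np.zeros(d))

# K-prior adaptation: new-data loss + function-space term on memory M
# (old model's predictions as soft labels) + weight-space term
# (delta/2) * ||w - w_star||^2. With M = all old inputs, this gradient
# equals the batch gradient exactly at the old optimum.
M = X_old                  # full memory; a compact memory would keep far fewer points
s = sigmoid(M @ w_star)    # soft labels from the old model
g_kprior = lambda w: (X_new.T @ (sigmoid(X_new @ w) - y_new)
                      + M.T @ (sigmoid(M @ w) - s)
                      + delta * (w - w_star))
w_adapt = gd(g_kprior, w_star)

print("distance to retrained model:", np.linalg.norm(w_adapt - w_batch))
```

Shrinking M below the full past data is exactly where the compact-memory question from the first entry comes in: a smaller memory trades reconstruction of past gradients against storage.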