Integral Continual Learning Along the Tangent Vector Field of Tasks
Liu, Tian Yu, Golatkar, Aditya, Soatto, Stefano, Achille, Alessandro
arXiv.org Artificial Intelligence
We propose a lightweight continual learning method that incorporates information from specialized datasets incrementally, by integrating it along the vector field of "generalist" models. The tangent plane to the specialist model acts as a generalist guide and avoids the kind of over-fitting that leads to catastrophic forgetting, while exploiting the convexity of the optimization landscape in the tangent plane. It maintains a small fixed-size memory buffer, as low as 0.4% of the source datasets, which is updated by simple resampling. Our method achieves strong performance across various buffer sizes for different datasets. Specifically, in the class-incremental setting we outperform the existing methods that do not require distillation by an average of 18.77% and 28.48% on Seq-CIFAR-10 and Seq-TinyImageNet, respectively. Our method can easily be used in conjunction with existing replay-based continual learning methods. When memory buffer constraints are relaxed to allow storage of metadata such as logits, we attain an error reduction of 17.84% towards the paragon performance on Seq-CIFAR-10.
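The abstract's key device is optimizing in the tangent plane of a pretrained model: a first-order Taylor expansion of the network around its current weights, in which fitting a squared loss becomes a convex (ordinary least-squares) problem. The sketch below illustrates that idea on a toy scalar model; it is an illustrative assumption-laden example, not the paper's implementation, and the model, data, and variable names are all hypothetical.

```python
import numpy as np

# Hedged sketch: linearize a toy model f(x; w) = tanh(x @ w) around
# "pretrained" weights w0, giving the tangent model
#   f_lin(x; w) = f(x; w0) + J(x; w0) @ (w - w0),
# where J is the Jacobian of the outputs w.r.t. the weights at w0.

def model(x, w):
    # Toy nonlinear model with a scalar output per sample.
    return np.tanh(x @ w)

def jacobian(x, w):
    # d/dw tanh(x @ w) = (1 - tanh(x @ w)^2) * x, one row per sample.
    return (1.0 - np.tanh(x @ w) ** 2)[:, None] * x

rng = np.random.default_rng(0)
w0 = rng.normal(size=3)            # stand-in for pretrained weights
x = rng.normal(size=(8, 3))        # small batch from a "new task"
y = rng.normal(size=8)             # targets for the new task

J = jacobian(x, w0)                # Jacobian frozen at w0
r = y - model(x, w0)               # residual of the frozen model

# In the tangent plane, minimizing squared error over dw = w - w0 is
# an ordinary least-squares problem, hence convex with a unique
# minimum-norm solution.
dw, *_ = np.linalg.lstsq(J, r, rcond=None)
w = w0 + dw

f_lin = model(x, w0) + J @ dw      # tangent-model prediction at w
```

Because the Jacobian is evaluated once at `w0`, the tangent-plane fit can never drift arbitrarily far from the generalist model, which is the intuition behind using it to mitigate catastrophic forgetting.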
Dec-11-2023