Principled and Efficient Transfer Learning of Deep Models via Neural Collapse
Xiao Li, Sheng Liu, Jinxin Zhou, Xinyu Lu, Carlos Fernandez-Granda, Zhihui Zhu, Qing Qu
arXiv.org Artificial Intelligence
As model size continues to grow and access to labeled training data remains limited, transfer learning has become a popular approach in many scientific and engineering fields. This study explores the phenomenon of neural collapse (NC) in transfer learning for classification problems, in which the last-layer features of a deep network exhibit zero within-class variability while the between-class feature means become maximally and equally separated. Through the lens of NC, this work makes the following findings on transfer learning: (i) preventing within-class variability collapse to a certain extent during model pre-training on source data leads to better transferability, as it better preserves the intrinsic structure of the input data; (ii) obtaining features with a higher degree of NC on downstream data during fine-tuning results in better test accuracy. These results provide new insight into commonly used heuristics in model pre-training, such as loss design, data augmentation, and projection heads, and lead to more efficient and principled methods for fine-tuning large pre-trained models. Compared to full-model fine-tuning, our proposed fine-tuning methods achieve comparable or even better performance while reducing the number of fine-tuned parameters by at least 70% and alleviating overfitting.
Feb-26-2023
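The within-class variability collapse described in the abstract is commonly quantified in the neural-collapse literature with an NC1-style metric, roughly tr(Σ_W Σ_B†)/K, which compares the within-class covariance of last-layer features to the between-class covariance. Below is a minimal NumPy sketch of such a metric; the function name and input conventions are illustrative assumptions, not the authors' released code.

```python
import numpy as np

def nc1_within_class_collapse(features: np.ndarray, labels: np.ndarray) -> float:
    """Approximate NC1 metric: tr(Sigma_W @ pinv(Sigma_B)) / K.

    features: (N, d) last-layer feature matrix; labels: (N,) class ids.
    Smaller values indicate stronger within-class variability collapse.
    """
    classes = np.unique(labels)
    K = len(classes)
    d = features.shape[1]
    global_mean = features.mean(axis=0)

    sigma_w = np.zeros((d, d))  # within-class covariance
    sigma_b = np.zeros((d, d))  # between-class covariance
    for c in classes:
        feats_c = features[labels == c]
        mu_c = feats_c.mean(axis=0)
        centered = feats_c - mu_c
        sigma_w += centered.T @ centered / len(feats_c)
        diff = (mu_c - global_mean)[:, None]
        sigma_b += diff @ diff.T
    sigma_w /= K
    sigma_b /= K

    return float(np.trace(sigma_w @ np.linalg.pinv(sigma_b)) / K)

# Usage on random features (no collapse expected, so the value is large):
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 64))
y = rng.integers(0, 10, size=1000)
print(nc1_within_class_collapse(X, y))
```

In practice one would evaluate this on features extracted from the penultimate layer of the pre-trained or fine-tuned model; the paper's findings suggest tracking how this quantity evolves on source data during pre-training and on downstream data during fine-tuning.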