Guan, Yunchuan
Learning to Learn Weight Generation via Trajectory Diffusion
Guan, Yunchuan, Liu, Yu, Zhou, Ke, Shen, Zhiqi, Belongie, Serge, Hwang, Jenq-Neng, Li, Lei
Diffusion-based algorithms have emerged as promising techniques for weight generation, particularly in scenarios such as multi-task learning that require frequent weight updates. However, existing solutions suffer from limited cross-task transferability. In addition, they use only optimal weights as training samples, ignoring the value of the other weights produced during the optimization process. To address these issues, we propose Lt-Di, which integrates the diffusion algorithm with meta-learning to generate weights for unseen tasks. Furthermore, we extend the vanilla diffusion algorithm into a trajectory diffusion algorithm that utilizes the other weights along the optimization trajectory. Trajectory diffusion decomposes the entire diffusion chain into multiple shorter ones, improving training and inference efficiency. We analyze the convergence properties of the weight-generation paradigm and improve convergence efficiency without additional time overhead. Our experiments demonstrate that Lt-Di achieves higher accuracy with lower computational overhead across various tasks, including zero-shot and few-shot learning, multi-domain generalization, and large-scale language model fine-tuning. Our code is released at https://github.com/tuantuange/Lt-Di.
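To make the trajectory-diffusion idea concrete, below is a minimal sketch of one way a long diffusion chain over weights can be decomposed into short per-checkpoint chains. The weight dimension, noise schedule, network architecture, and the `SegmentDenoiser` conditioning are illustrative assumptions for this sketch, not the released Lt-Di implementation (see the repository above for the authors' code).

```python
# Minimal sketch: trajectory diffusion over flattened network weights.
# Each checkpoint w_k along an optimization trajectory anchors a short
# diffusion chain; a single denoiser is conditioned on the segment index.
import torch
import torch.nn as nn

WEIGHT_DIM = 256        # flattened target-network weight vector (assumed size)
NUM_SEGMENTS = 4        # checkpoints w_1 ... w_K along the trajectory (assumed)
STEPS_PER_SEGMENT = 50  # each short chain replaces part of one long chain

class SegmentDenoiser(nn.Module):
    """Predicts the noise added to a weight vector, conditioned on the
    diffusion step and on which trajectory segment is being denoised."""
    def __init__(self, dim=WEIGHT_DIM):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 2, 512), nn.SiLU(),
            nn.Linear(512, 512), nn.SiLU(),
            nn.Linear(512, dim),
        )

    def forward(self, w_noisy, t, k):
        # Normalized step and segment indices as conditioning features.
        cond = torch.tensor([t / STEPS_PER_SEGMENT, k / NUM_SEGMENTS])
        return self.net(torch.cat([w_noisy, cond]))

def train_step(model, optimizer, trajectory, betas):
    """Sample a segment k and a step t, noise checkpoint w_k with the usual
    DDPM forward process, and regress the injected noise."""
    k = torch.randint(0, NUM_SEGMENTS, (1,)).item()
    t = torch.randint(0, STEPS_PER_SEGMENT, (1,)).item()
    w_k = trajectory[k]
    alpha_bar = torch.cumprod(1.0 - betas, dim=0)[t]
    noise = torch.randn_like(w_k)
    w_noisy = alpha_bar.sqrt() * w_k + (1.0 - alpha_bar).sqrt() * noise
    loss = ((model(w_noisy, t, k) - noise) ** 2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy usage: random vectors stand in for real trajectory checkpoints.
trajectory = [torch.randn(WEIGHT_DIM) for _ in range(NUM_SEGMENTS)]
betas = torch.linspace(1e-4, 0.02, STEPS_PER_SEGMENT)
model = SegmentDenoiser()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(100):
    train_step(model, opt, trajectory, betas)
```

A natural inference procedure under this sketch would run the short chains back to back, each segment refining the previous segment's output toward the next checkpoint; the abstract attributes the training and inference efficiency gains to exactly this decomposition into shorter chains.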
Unsupervised Meta-Learning via Dynamic Head and Heterogeneous Task Construction for Few-Shot Classification
Guan, Yunchuan, Liu, Yu, Liu, Ketong, Zhou, Ke, Shen, Zhiqi
Optimization-based meta-learning algorithms Finn et al. (2017); Raghu et al. (2020); Nichol et al. (2018) have been shown to achieve excellent generalization performance in few-shot learning and reinforcement learning, areas where the more commonly used pre-train and fine-tune strategy exhibits disadvantages in training overhead, reliance on massive samples, and accuracy. In recent years, however, new research has shown that models pre-trained with the classical Whole-Class Training (WCT) strategy achieve comparable or even better accuracy on multiple few-shot image classification datasets Tian et al. (2020); Chen et al. (2021). These inconsistent conclusions obscure the nature of meta-learning and, in turn, hinder development of the area: why and when meta-learning is better than other algorithms in few-shot classification remains to be explored. In this paper, we perform pre-experiments by adjusting the proportion of label noise and the degree of task heterogeneity in the dataset. We use Singular Vector Canonical Correlation Analysis to quantify the representation stability of the neural network and thereby compare the behavior of meta-learning and classical learning algorithms. We find that, benefiting from its bi-level optimization strategy, meta-learning is more robust to label noise and heterogeneous tasks. Based on this conclusion, we argue for a promising future for meta-learning in the unsupervised setting and propose DHM-UHT, a dynamic-head meta-learning algorithm with unsupervised heterogeneous task construction. The core idea of DHM-UHT is to use DBSCAN and a dynamic head to construct heterogeneous tasks and to meta-learn the whole process of unsupervised heterogeneous task construction.
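As a rough illustration of the task-construction step described above, the sketch below clusters unlabeled embeddings with DBSCAN to form pseudo-classes and scores queries with a per-task head whose output width follows the number of clusters. The function names, hyperparameters, and the prototype-based head are assumptions made for illustration, not the authors' DHM-UHT pipeline.

```python
# Minimal sketch: unsupervised heterogeneous task construction via DBSCAN
# plus a per-task "dynamic head" sized to the number of pseudo-classes.
import numpy as np
import torch
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs

def build_unsupervised_task(embeddings, eps=0.5, min_samples=5, k_shot=1, n_query=5):
    """Cluster unlabeled embeddings; each DBSCAN cluster becomes a pseudo-class.
    Cluster counts and sizes vary, so the constructed tasks are heterogeneous."""
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(embeddings)
    classes = [c for c in np.unique(labels) if c != -1]  # drop DBSCAN noise points
    support, query = [], []
    for way, c in enumerate(classes):
        idx = np.flatnonzero(labels == c)
        np.random.shuffle(idx)
        support += [(i, way) for i in idx[:k_shot]]
        query += [(i, way) for i in idx[k_shot:k_shot + n_query]]
    return support, query, len(classes)

def dynamic_head_logits(query_feats, support_feats, support_labels, n_way):
    """The 'dynamic head' of this sketch: a per-task linear classifier whose
    weights are the support prototypes, so its width adapts to each task."""
    protos = torch.stack([support_feats[support_labels == c].mean(dim=0)
                          for c in range(n_way)])
    return query_feats @ protos.t()

# Toy usage: synthetic blobs stand in for encoder embeddings of unlabeled data.
emb, _ = make_blobs(n_samples=200, centers=4, n_features=2,
                    cluster_std=0.3, random_state=0)
support, query, n_way = build_unsupervised_task(emb.astype(np.float32))
if n_way > 0 and query:
    feats = torch.from_numpy(emb.astype(np.float32))
    s_idx = torch.tensor([i for i, _ in support])
    s_lab = torch.tensor([y for _, y in support])
    q_idx = torch.tensor([i for i, _ in query])
    logits = dynamic_head_logits(feats[q_idx], feats[s_idx], s_lab, n_way)
    print(logits.shape)  # (num_query, n_way): head width adapts to the task
```

Note that this only shows the forward task-building and classification step; per the abstract, DHM-UHT additionally meta-learns the entire unsupervised heterogeneous task construction process, which is not reflected in this sketch.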