TwinTURBO: Semi-Supervised Fine-Tuning of Foundation Models via Mutual Information Decompositions for Downstream Task and Latent Spaces

Guillaume Quétant, Pavlo Molchanov, Slava Voloshynovskiy

arXiv.org Machine Learning 

Foundation models are large-scale neural networks pre-trained on diverse data to learn general-purpose representations that can be fine-tuned for specific downstream tasks. Fine-tuning poses significant challenges, especially in the case of low-labelled data, a semi-supervised learning setting where only a small fraction of the data samples are labelled while the majority remain unlabelled. Although foundation models are pre-trained on large datasets in a self-supervised manner, their deployment often requires fine-tuning on new datasets with limited labelled samples and potential distribution shifts. Furthermore, the downstream tasks frequently differ from the pre-training objectives, complicating the adaptation process. Existing semi-supervised approaches, such as pseudo-labelling, rely heavily on assumptions about data distributions or task-specific tuning, limiting their generalisability. Addressing these challenges is essential to fully exploit the potential of foundation models and ensure their adaptability and scalability in diverse applications. The main contributions of this study are: A new framework for foundation model fine-tuning: we introduce a fine-tuning strategy based on mutual information decomposition.
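
As a rough illustration of the kind of objective such a decomposition can yield (the two-term split and the weights \lambda_{\mathrm{lab}}, \lambda_{\mathrm{unlab}} below are assumptions for exposition, not the paper's actual formulation), a semi-supervised fine-tuning loss may combine a supervised mutual-information term estimated on the labelled subset with an unsupervised term between the data and the latent representation on the unlabelled subset:

\mathcal{L}_{\mathrm{semi}}
  = -\,\lambda_{\mathrm{lab}}\,\hat{I}_{\mathrm{lab}}(Z; Y)
    \;-\;\lambda_{\mathrm{unlab}}\,\hat{I}_{\mathrm{unlab}}(X; Z),
\qquad
I(X; Z) = H(X) - H(X \mid Z),

where \hat{I}_{\mathrm{lab}} and \hat{I}_{\mathrm{unlab}} denote mutual-information estimates over the labelled and unlabelled subsets, and the intractable terms are in practice replaced by variational bounds (e.g. reconstruction- or contrastive-style surrogates).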