Towards Estimating Transferability using Hard Subsets
Menta, Tarun Ram; Jandial, Surgan; Patil, Akash; KB, Vimal; Bachu, Saketh; Krishnamurthy, Balaji; Balasubramanian, Vineeth N.; Agarwal, Chirag; Sarkar, Mausoom
arXiv.org Artificial Intelligence
As transfer learning techniques are increasingly used to transfer knowledge from a source model to a target task, it becomes important to quantify which source models are suitable for a given target task without performing computationally expensive fine-tuning. By leveraging the model's internal and output representations, we introduce two techniques, one class-agnostic and one class-specific, to identify harder subsets and show that H [...]
Transfer learning (Pan & Yang, 2009; Torrey & Shavlik, 2010; Weiss et al., 2016) aims to improve the performance of models on target tasks by utilizing knowledge from source tasks. With the increasing development of large-scale pre-trained models (Devlin et al., 2019; Chen et al., 2020a;b; Radford et al., 2021b) and the availability of multiple model choices for transfer learning (e.g., the model hubs of PyTorch, TensorFlow, and Hugging Face), it is critical to estimate their transferability without training on the target task and to determine how effectively transfer learning algorithms will transfer knowledge from the source to the target task. To this end, transferability estimation metrics (Zamir et al., 2018b; Achille et al., 2019; Tran et al., 2019b; Pándy et al., 2022; Nguyen et al., 2020) have recently been proposed to quantify how easily the knowledge learned by these models can be used with minimal to no additional training on the target dataset. Given multiple pre-trained source models and target datasets, estimating transferability is essential because it is non-trivial to determine which source model transfers best to a target dataset, and because training models on all source-target combinations can be computationally expensive. Recent years have seen a few different approaches (Zamir et al., 2018b; Achille et al., 2019; Tran et al., 2019b; Pándy et al., 2022; Nguyen et al., 2020) for estimating how well a source model will perform on a given transfer learning task.
However, these existing methods often either require performing the transfer learning task itself for parameter optimization (Achille et al., 2019; Zamir et al., 2018b) or make strong assumptions about the source and target datasets (Tran et al., 2019b; Zamir et al., 2018b).
Jan-17-2023