Transfer Learning
Homomorphisms Between Transfer, Multi-Task, and Meta-Learning Systems
Transfer learning, multi-task learning, and meta-learning are well-studied topics concerned with the generalization of knowledge across learning tasks and are closely related to general intelligence. But, the formal, general systems differences between them are underexplored in the literature. This lack of systems-level formalism leads to difficulties in coordinating related, inter-disciplinary engineering efforts. This manuscript formalizes transfer learning, multi-task learning, and meta-learning as abstract learning systems, consistent with the formal-minimalist abstract systems theory of Mesarovic and Takahara. Moreover, it uses the presented formalism to relate the three concepts of learning in terms of composition, hierarchy, and structural homomorphism. Findings are readily depicted in terms of input-output systems, highlighting the ease of delineating formal, general systems differences between transfer, multi-task, and meta-learning.
Few-Shot Cross-Lingual TTS Using Transferable Phoneme Embedding
Huang, Wei-Ping, Chen, Po-Chun, Huang, Sung-Feng, Lee, Hung-yi
This paper studies a transferable phoneme embedding framework that aims to deal with the cross-lingual text-to-speech (TTS) problem under the few-shot setting. Transfer learning is a common approach when it comes to few-shot learning since training from scratch on few-shot training data is bound to overfit. Still, we find that the naive transfer learning approach fails to adapt to unseen languages under extremely few-shot settings, where less than 8 minutes of data is provided. We deal with the problem by proposing a framework that consists of a phoneme-based TTS model and a codebook module to project phonemes from different languages into a learned latent space. Furthermore, by utilizing phoneme-level averaged self-supervised learned features, we effectively improve the quality of synthesized speeches. Experiments show that using 4 utterances, which is about 30 seconds of data, is enough to synthesize intelligible speech when adapting to an unseen language using our framework.
The Curse of Low Task Diversity: On the Failure of Transfer Learning to Outperform MAML and Their Empirical Equivalence
Miranda, Brando, Yu, Patrick, Wang, Yu-Xiong, Koyejo, Sanmi
Recently, it has been observed that a transfer learning solution might be all we need to solve many few-shot learning benchmarks -- thus raising important questions about when and how meta-learning algorithms should be deployed. In this paper, we seek to clarify these questions by 1. proposing a novel metric -- the diversity coefficient -- to measure the diversity of tasks in a few-shot learning benchmark and 2. by comparing Model-Agnostic Meta-Learning (MAML) and transfer learning under fair conditions (same architecture, same optimizer, and all models trained to convergence). Using the diversity coefficient, we show that the popular MiniImageNet and CIFAR-FS few-shot learning benchmarks have low diversity. This novel insight contextualizes claims that transfer learning solutions are better than meta-learned solutions in the regime of low diversity under a fair comparison. Specifically, we empirically find that a low diversity coefficient correlates with a high similarity between transfer learning and MAML learned solutions in terms of accuracy at meta-test time and classification layer similarity (using feature based distance metrics like SVCCA, PWCCA, CKA, and OPD). To further support our claim, we find this meta-test accuracy holds even as the model size changes. Therefore, we conclude that in the low diversity regime, MAML and transfer learning have equivalent meta-test performance when both are compared fairly. We also hope our work inspires more thoughtful constructions and quantitative evaluations of meta-learning benchmarks in the future.
RadImageNet: An Open Radiologic Deep Learning Research Dataset for Effective Transfer Learning
"Just Accepted" papers have undergone full peer review and have been accepted for publication in Radiology: Artificial Intelligence. This article will undergo copyediting, layout, and proof review before it is published in its final version. Please note that during production of the final copyedited article, errors may be discovered which could affect the content. To demonstrate the value of pretraining with millions of radiologic images compared with ImageNet photographic images on downstream medical applications when using transfer learning. This retrospective study included patients who had a radiologic study between 2005 and 2020 at an outpatient imaging facility.
Incremental Learning Meets Transfer Learning: Application to Multi-site Prostate MRI Segmentation
You, Chenyu, Xiang, Jinlin, Su, Kun, Zhang, Xiaoran, Dong, Siyuan, Onofrey, John, Staib, Lawrence, Duncan, James S.
Many medical datasets have recently been created for medical image segmentation tasks, and it is natural to question whether we can use them to sequentially train a single model that (1) performs better on all these datasets, and (2) generalizes well and transfers better to the unknown target site domain. Prior works have achieved this goal by jointly training one model on multi-site datasets, which achieve competitive performance on average but such methods rely on the assumption about the availability of all training data, thus limiting its effectiveness in practical deployment. In this paper, we propose a novel multi-site segmentation framework called incremental-transfer learning (ITL), which learns a model from multi-site datasets in an end-to-end sequential fashion. Specifically, "incremental" refers to training sequentially constructed datasets, and "transfer" is achieved by leveraging useful information from the linear combination of embedding features on each dataset. In addition, we introduce our ITL framework, where we train the network including a site-agnostic encoder with pre-trained weights and at most two segmentation decoder heads. We also design a novel site-level incremental loss in order to generalize well on the target domain. Second, we show for the first time that leveraging our ITL training scheme is able to alleviate challenging catastrophic forgetting problems in incremental learning. We conduct experiments using five challenging benchmark datasets to validate the effectiveness of our incremental-transfer learning approach. Our approach makes minimal assumptions on computation resources and domain-specific expertise, and hence constitutes a strong starting point in multi-site medical image segmentation.
Transfer Learning for Segmentation Problems: Choose the Right Encoder and Skip the Decoder
Dippel, Jonas, Lenga, Matthias, Goerttler, Thomas, Obermayer, Klaus, Höhne, Johannes
It is common practice to reuse models initially trained on different data to increase downstream task performance. Especially in the computer vision domain, ImageNet-pretrained weights have been successfully used for various tasks. In this work, we investigate the impact of transfer learning for segmentation problems, being pixel-wise classification problems that can be tackled with encoder-decoder architectures. We find that transfer learning the decoder does not help downstream segmentation tasks, while transfer learning the encoder is truly beneficial. We demonstrate that pretrained weights for a decoder may yield faster convergence, but they do not improve the overall model performance as one can obtain equivalent results with randomly initialized decoders. However, we show that it is more effective to reuse encoder weights trained on a segmentation or reconstruction task than reusing encoder weights trained on classification tasks. This finding implicates that using ImageNet-pretrained encoders for downstream segmentation problems is suboptimal. We also propose a contrastive self-supervised approach with multiple self-reconstruction tasks, which provides encoders that are suitable for transfer learning in segmentation problems in the absence of segmentation labels.
Transfer Learning Implementation using Keras
What if I told you that a network that classifies 10 different types of vehicles can provide useful knowledge for a classification problem with 3 different types of cars? This is called transfer learning – a method that uses pre-trained neural networks to solve a new, similar problem. Over the years, people have been trying to produce different methods to train neural networks with small amounts of data. Those methods are used to generate more data for training. However, transfer learning provides an alternative by learning from existing architectures (trained on large datasets) and further training them for our new problem.
Easy Guide to Statistical analysis & Data Science Analytics
This online training provides a comprehensive list of analytical skills designed for students and researchers interested to learn applied statistics and data science to tackle common and complex real world research problems. This training covers end-to-end guide from basic statistics such as Chi-square test and multi-factorial ANOVA, to multivariate statistics such as Structural equation modeling and Multilevel modeling. Similarly, you will also learn powerful unsupervised machine learning techniques such as Apriori algorithm and tSNE, to more complex supervised machine learning such as Deep Learning and Transfer Learning. Whether you are a beginner or advanced researcher, we believe there is something for you! This workshop helps you better understand complex constructs by demystifying data science and statistical concepts and techniques for you.
What is Transfer Learning? - Medi-AI
The reuse of a pre-trained model on a new problem is known as transfer learning in machine learning. A machine uses the knowledge learned from a prior assignment to increase prediction about a new task in transfer learning.The general idea of transfer learning is to use knowledge learned from tasks for which a lot of labeled data is available in settings where only a little labeled data is available.
Head2Toe: Utilizing Intermediate Representations for Better Transfer Learning
Evci, Utku, Dumoulin, Vincent, Larochelle, Hugo, Mozer, Michael C.
Transfer-learning methods aim to improve performance in a data-scarce target domain using a model pretrained on a data-rich source domain. A cost-efficient strategy, linear probing, involves freezing the source model and training a new classification head for the target domain. This strategy is outperformed by a more costly but state-of-the-art method -- fine-tuning all parameters of the source model to the target domain -- possibly because fine-tuning allows the model to leverage useful information from intermediate layers which is otherwise discarded by the later pretrained layers. We explore the hypothesis that these intermediate layers might be directly exploited. We propose a method, Head-to-Toe probing (Head2Toe), that selects features from all layers of the source model to train a classification head for the target-domain. In evaluations on the VTAB-1k, Head2Toe matches performance obtained with fine-tuning on average while reducing training and storage cost hundred folds or more, but critically, for out-of-distribution transfer, Head2Toe outperforms fine-tuning.