Transfer Learning
Transfusion: Understanding Transfer Learning for Medical Imaging
Raghu, Maithra, Zhang, Chiyuan, Kleinberg, Jon, Bengio, Samy
Transfer learning from natural image datasets, particularly ImageNet, using standard large models and corresponding pretrained weights has become a de-facto method for deep learning applications to medical imaging. However, there are fundamental differences in data sizes, features and task specifications between natural image classification and the target medical tasks, and there is little understanding of the effects of transfer. In this paper, we explore properties of transfer learning for medical imaging. A performance evaluation on two large scale medical imaging tasks shows that surprisingly, transfer offers little benefit to performance, and simple, lightweight models can perform comparably to ImageNet architectures. Investigating the learned representations and features, we find that some of the differences from transfer learning are due to the over-parametrization of standard models rather than sophisticated feature reuse.
Catastrophic Forgetting Meets Negative Transfer: Batch Spectral Shrinkage for Safe Transfer Learning
Chen, Xinyang, Wang, Sinan, Fu, Bo, Long, Mingsheng, Wang, Jianmin
Before sufficient training data is available, fine-tuning neural networks pre-trained on large-scale datasets substantially outperforms training from random initialization. However, fine-tuning methods suffer from two dilemmas, catastrophic forgetting and negative transfer. While several methods with explicit attempts to overcome catastrophic forgetting have been proposed, negative transfer is rarely delved into. In this paper, we launch an in-depth empirical investigation into negative transfer in fine-tuning and find that, for the weight parameters and feature representations, transferability of their spectral components is diverse. For safe transfer learning, we present Batch Spectral Shrinkage (BSS), a novel regularization approach to penalizing smaller singular values so that untransferable spectral components are suppressed.
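As a rough illustration of the regularizer described in this abstract, the sketch below computes a penalty on the smallest singular values of a batch feature matrix with NumPy. The sum-of-squares form, the function name `bss_penalty`, and the parameter `k` (how many of the smallest singular values to penalize) are assumptions made for illustration; the abstract states only that smaller singular values are penalized.

```python
import numpy as np


def bss_penalty(features, k=1):
    """Sum of squares of the k smallest singular values of `features`.

    `features` is assumed to be a 2-D array of shape (batch_size, feature_dim).
    The sum-of-squares form and `k` are illustrative assumptions.
    """
    # np.linalg.svd returns singular values sorted in descending order.
    singular_values = np.linalg.svd(features, compute_uv=False)
    if not 1 <= k <= singular_values.size:
        raise ValueError("k must be between 1 and the number of singular values")
    return float(np.sum(singular_values[-k:] ** 2))


# Toy batch whose singular values are 3.0, 2.0 and 0.5 by construction.
toy_features = np.diag([3.0, 2.0, 0.5])
print(bss_penalty(toy_features, k=1))
print(bss_penalty(toy_features, k=2))
```

During fine-tuning, a term like this would typically be added to the task loss with a small weighting coefficient and computed with a differentiable SVD in the training framework; both of those details are likewise assumptions here rather than statements from the abstract.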
Customized Video QoE Estimation with Algorithm-Agnostic Transfer Learning
Ickin, Selim, Fiedler, Markus, Vandikas, Konstantinos
The development of QoE models by means of Machine Learning (ML) is challenging, amongst others due to small-size datasets, lack of diversity in user profiles in the source domain, and too much diversity in the target domains of QoE models. Furthermore, datasets can be hard to share between research entities, as the machine learning models and the collected user data from the user studies may be IPR- or GDPR-sensitive. This makes a decentralized learning-based framework appealing for sharing and aggregating learned knowledge in-between the local models that map the obtained metrics to the user QoE, such as Mean Opinion Scores (MOS). In this paper, we present a transfer learning-based ML model training approach, which allows decentralized local models to share generic indicators on MOS to learn a generic base model, and then customize the generic base model further using additional features that are unique to those specific localized (and potentially sensitive) QoE nodes. We show that the proposed approach is agnostic to specific ML algorithms, stacked upon each other, as it does not necessitate the collaborating localized nodes to run the same ML algorithm. Our reproducible results reveal the advantages of stacking various generic and specific models with corresponding weight factors. Moreover, we identify the optimal combination of algorithms and weight factors for the corresponding localized QoE nodes.
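The stacking idea in this abstract can be sketched in a few lines of plain Python. Everything concrete below is hypothetical: the feature names (stall seconds, device type), the toy MOS values, the choice of a least-squares line as the generic model and per-group mean residuals as the local model, and the additive combination rule with a single weight factor. The abstract does not specify any of these; the sketch only shows that the two stacked models need not use the same algorithm.

```python
from statistics import mean


def fit_line(xs, ys):
    """Generic base model: least-squares line y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    var = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var
    return slope, my - slope * mx


def fit_group_means(keys, residuals):
    """Local specific model: mean residual per category of a local-only feature."""
    groups = {}
    for key, residual in zip(keys, residuals):
        groups.setdefault(key, []).append(residual)
    return {key: mean(values) for key, values in groups.items()}


def stacked_predict(line, group_means, weight, x, key):
    """Generic prediction plus a weighted local correction (assumed rule)."""
    slope, intercept = line
    base = slope * x + intercept
    return base + weight * group_means.get(key, 0.0)


# Shared, non-sensitive indicator (hypothetical): stall seconds versus MOS.
generic = fit_line([0.0, 1.0, 2.0, 3.0, 4.0], [5.0, 4.0, 3.0, 2.0, 1.0])

# Local node data with an extra feature that is never shared (hypothetical).
local_stalls = [0.0, 2.0, 0.0, 2.0]
local_device = ["tv", "tv", "phone", "phone"]
local_mos = [4.5, 2.5, 5.0, 3.0]
residuals = [
    mos - (generic[0] * stall + generic[1])
    for stall, mos in zip(local_stalls, local_mos)
]
specific = fit_group_means(local_device, residuals)

print(stacked_predict(generic, specific, 1.0, 1.0, "tv"))
print(stacked_predict(generic, specific, 0.5, 1.0, "tv"))
```

Setting the weight to zero recovers the generic base model, so each node can tune how strongly its local customization is applied.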
#012 TF Transfer Learning in TensorFlow 2.0 Master Data Science
Highlights: In this post we are going to show how to build a computer vision model without building it from scratch. The idea behind transfer learning is that a neural network that has been trained on a large dataset can apply its knowledge to a dataset that it has never seen before. That is why it is called transfer learning: we transfer the learning of an existing model to a new dataset. Previously we explored how to improve a model's performance using data augmentation. The question now is, "What if we don't have enough data to train our network from scratch?"
Environment-agnostic Multitask Learning for Natural Language Grounded Navigation
Wang, Xin, Jain, Vihan, Ie, Eugene, Wang, William Yang, Kozareva, Zornitsa, Ravi, Sujith
Recent research efforts enable study for natural language grounded navigation in photo-realistic environments, e.g., following natural language instructions or dialog. However, existing methods tend to overfit training data in seen environments and fail to generalize well in previously unseen environments. In order to close the gap between seen and unseen environments, we aim at learning a generalized navigation model from two novel perspectives: (1) we introduce a multitask navigation model that can be seamlessly trained on both Vision-Language Navigation (VLN) and Navigation from Dialog History (NDH) tasks, which benefits from richer natural language guidance and effectively transfers knowledge across tasks; (2) we propose to learn environment-agnostic representations for the navigation policy that are invariant among the environments seen during training, thus generalizing better on unseen environments. Extensive experiments show that training with an environment-agnostic multitask learning objective significantly reduces the performance gap between seen and unseen environments, and the navigation agent so trained outperforms the baselines on unseen environments by 16% (relative measure on success rate) on VLN and 120% (goal progress) on NDH. Our submission to the CVDN leaderboard establishes a new state-of-the-art for the NDH task, outperforming the existing best model by more than 66% (goal progress) on the holdout test set.
The Utility of Feature Reuse: Transfer Learning in Data-Starved Regimes
Verenich, Edward, Velasquez, Alvaro, Murshed, M. G. Sarwar, Hussain, Faraz
The use of transfer learning with deep neural networks has increasingly become widespread for deploying well-tested computer vision systems to newer domains, especially those with limited datasets. We describe a transfer learning use case for a domain with a data-starved regime, having fewer than 100 labeled target samples. We evaluate the effectiveness of convolutional feature extraction and fine-tuning of overparameterized models with respect to the size of target training data, as well as their generalization performance on data with covariate shift, or out-of-distribution (OOD) data. Our experiments show that both overparameterization and feature reuse contribute to successful application of transfer learning in training image classifiers in data-starved regimes.
Theoretical Models of Learning to Learn
A machine can only learn if it is biased in some way. Typically the bias is supplied by hand, for example through the choice of an appropriate set of features. However, if the learning machine is embedded within an environment of related tasks, then it can learn its own bias by learning sufficiently many tasks from the environment. In this paper two models of bias learning (or equivalently, learning to learn) are introduced and the main theoretical results presented. The first model is a PAC-type model based on empirical process theory, while the second is a hierarchical Bayes model.
Adapted tree boosting for Transfer Learning
Fang, Wenjing, Chen, Chaochao, Song, Bowen, Wang, Li, Zhou, Jun, Zhu, Kenny Q.
Secure online transaction is an essential task for e-commerce platforms. Alipay, one of the world's leading cashless payment platforms, provides the payment service to both merchants and individual customers. The fraud detection models are built to protect the customers, but stronger demands are raised by the new scenes, which are lacking in training data and labels. The proposed model makes a difference by utilizing the data under similar old scenes, and the data under a new scene is treated as the target domain to be promoted. Inspired by this real case in Alipay, we view the problem as a transfer learning problem and design a set of revise strategies to transfer the source domain models to the target domain under the framework of gradient boosting tree models. This work provides an option for the cold-starting and data-sharing problems.
Using Transfer Learning to Overcome the Barriers Facing Machine Learning in Materials Science - News
Machine learning's ability to perform intellectually demanding tasks across various fields, materials science included, has caused it to receive considerable attention. Many believe that it could be used to unlock major time and cost savings in the development of new materials. The growing demand for the use of machine learning to derive fast-to-evaluate surrogate models of material properties has prompted scientists at the National Institute for Materials Science in Tsukuba, Japan, to demonstrate that it could be the key driver of the "next frontier" of materials science in recently published research. To learn, machines rely on processing data using both supervised and unsupervised learning. With no data, however, there is nothing to learn from.
Gaussian Process Models for Link Analysis and Transfer Learning
In this paper we develop a Gaussian process (GP) framework to model a collection of reciprocal random variables defined on the edges of a network. We show how to construct GP priors, i.e., covariance functions, on the edges of directed, undirected, and bipartite graphs. The model suggests an intimate connection between link prediction and transfer learning, which were traditionally considered two separate research topics. Though a straightforward GP inference has a very high complexity, we develop an efficient learning algorithm that can handle a large number of observations. The experimental results on several real-world data sets verify superior learning capacity.