Transfer Learning


Transfer Learning in Visual and Relational Reasoning

#artificialintelligence

Similar developments have emerged in the Natural Language Processing (NLP) community. The success of transfer learning raises several research questions, such as which characteristics make a dataset well suited to pretraining (notably ImageNet [huh2016makes]), or why the performance of models with different architectures is correlated between the source and target domains [kornblith2019better]. One of the most systematic works in this area is the computational taxonomic map for task transfer learning [zamir2018taskonomy], which aimed at discovering the dependencies between twenty-six 2D, 2.5D, 3D, and semantic computer vision tasks. In this work we focus on transfer learning in multi-modal tasks combining vision and language [mogadala2019trends]. More precisely, we narrow the scope to transfer learning between visual reasoning tasks that have a "nice" logical structure, e.g., [johnson2017clevr, yang2018dataset, song2018explore].


Data Augmentation for Deep Transfer Learning

arXiv.org Machine Learning

Current approaches to deep learning are beginning to rely heavily on transfer learning as an effective method for reducing overfitting, improving model performance, and quickly learning new tasks. Similarly, such pre-trained models are often used to create embedding representations for various types of data, such as text and images, which can then be fed as input into separate, downstream models. However, in cases where such transfer learning models perform poorly (i.e., for data outside of the training distribution), one must resort to fine-tuning such models, or even retraining them completely. Currently, no form of data augmentation has been proposed that can be applied directly to embedding inputs to improve downstream model performance. In this work, we introduce four new types of data augmentation that are generally applicable to embedding inputs, thus making them useful in both Natural Language Processing (NLP) and Computer Vision (CV) applications. For models trained on downstream tasks with such embedding inputs, these augmentation methods are shown to improve the AUC score of the models from a score of 0.9582 to 0.9812 and significantly increase the model's ability to identify classes of data that are not seen during training.
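The abstract does not name its four augmentation methods, so the following is only a minimal sketch of what augmenting embedding inputs directly could look like. Both transforms (Gaussian jitter and linear interpolation between two embeddings) and the function names `jitter_embedding` and `interpolate_embeddings` are illustrative assumptions, not the paper's methods.

```python
import random
from typing import List, Optional

Embedding = List[float]


def jitter_embedding(
    embedding: Embedding,
    sigma: float = 0.01,
    rng: Optional[random.Random] = None,
) -> Embedding:
    """Return a copy of the embedding with Gaussian noise added to each dimension.

    Assumption: small isotropic noise keeps the vector close to the original,
    so the label of the example is treated as unchanged.
    """
    rng = rng or random.Random()
    return [x + rng.gauss(0.0, sigma) for x in embedding]


def interpolate_embeddings(a: Embedding, b: Embedding, lam: float) -> Embedding:
    """Return the convex combination lam * a + (1 - lam) * b.

    Assumption: a and b come from the same class, so the interpolated
    vector can reuse their label.
    """
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    return [lam * x + (1.0 - lam) * y for x, y in zip(a, b)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    cat_a = [0.2, -1.0, 0.7]
    cat_b = [0.4, -0.8, 0.5]
    print(jitter_embedding(cat_a, sigma=0.05, rng=rng))
    print(interpolate_embeddings(cat_a, cat_b, lam=0.5))
```

Because both functions operate on plain vectors, they apply equally to text and image embeddings, which is the property the abstract emphasizes.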


Transfer Learning in Visual and Relational Reasoning

arXiv.org Artificial Intelligence

Transfer learning is becoming the de facto solution for vision and text encoders in the front-end processing of machine learning solutions. Drawing on the vast amounts of knowledge in pre-trained models and then fine-tuning them makes it possible to achieve better performance in domains where labeled data is limited. In this paper, we analyze the efficiency of transfer learning in visual reasoning by introducing a new model (SAMNet) and testing it on two datasets: COG and CLEVR. Our new model achieves state-of-the-art accuracy on COG and shows significantly better generalization capabilities compared to the baseline. We also formalize a taxonomy of transfer learning for visual reasoning around three axes: feature, temporal, and reasoning transfer. Based on extensive experimentation with transfer learning on each of the two datasets, we show the performance of the new model along each axis.


ARUBA: Learning-to-Learn with Less Regret

#artificialintelligence

Figure 1: Illustration of the meta-learning process as applied to the task of personalized next-word prediction. Here each mobile device corresponds to a different next-word prediction task, with the test-task not seen during meta-training (Step 1). In the classical machine learning setup, we aim to learn a single model for a single task given many training samples from the same distribution. However, in many practical applications, we are in fact exposed to several distinct yet related tasks that have only a few examples each. Because the data now come from different training distributions, simply learning a single global model, e.g., via stochastic gradient descent (SGD), may result in poor performance on each task.
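The point about a single global model can be illustrated with a toy sketch. It assumes one-dimensional tasks with quadratic losses and uses the global model as a shared initialization that each task adapts with a few gradient steps. This setup and the names `task_loss`, `global_model`, and `adapt` are hypothetical and do not reproduce the ARUBA algorithm itself.

```python
from typing import List


def task_loss(w: float, center: float) -> float:
    """Quadratic loss of a toy task whose optimum is at `center`."""
    return (w - center) ** 2


def global_model(centers: List[float]) -> float:
    """Single model minimizing the average loss over all tasks (their mean)."""
    return sum(centers) / len(centers)


def adapt(w_init: float, center: float, lr: float = 0.25, steps: int = 2) -> float:
    """Run a few gradient-descent steps on one task, starting from w_init."""
    w = w_init
    for _ in range(steps):
        grad = 2.0 * (w - center)  # derivative of (w - center)^2
        w -= lr * grad
    return w


if __name__ == "__main__":
    centers = [0.0, 4.0]  # two related but distinct tasks
    w_global = global_model(centers)
    for c in centers:
        w_task = adapt(w_global, c)
        print(
            f"task optimum {c}: "
            f"global loss {task_loss(w_global, c)}, "
            f"adapted loss {task_loss(w_task, c)}"
        )
```

In this toy setting the global model sits between the two task optima and fits neither well, while two adaptation steps per task bring the loss down sharply.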


Research into machine-learning specialty finds new home at USC Viterbi

#artificialintelligence

With a new $1.5 million grant, the growing field of transfer learning has come to the Ming Hsieh Department of Electrical and Computer Engineering at the USC Viterbi School of Engineering. The grant was awarded to three professors -- Salman Avestimehr, Antonio Ortega and Mahdi Soltanolkotabi -- who will work with Ilias Diakonikolas at the University of Wisconsin, Madison, to address the theoretical foundations of this field. Modern machine learning models are breaking new ground in data science, achieving unprecedented performance on tasks like classifying images in one thousand different image categories. This is achieved by training gigantic neural networks. "Neural networks work really well because they can be trained on huge amounts of pre-existing data that has previously been tagged and collected," said Avestimehr, the primary investigator of the project.


Nasdaq To Expand Use of AI With Transfer Learning

#artificialintelligence

The technology took just over one year to develop in a collaboration between Nasdaq's market technology business, its machine intelligence lab in …


Transfer Learning Toolkit: Primers and Benchmarks

arXiv.org Machine Learning

The transfer learning toolkit wraps the code of 17 transfer learning models and provides integrated interfaces, allowing users to use those models by calling a simple function. It is easy for primary researchers to use this toolkit and to choose proper models for real-world applications. The toolkit is written in Python and distributed under the MIT open source license. In this paper, the current state of this toolkit is described and the necessary environment setting and usage are introduced. Keywords: Transfer Learning, Toolkit. 1. Introduction. Transfer learning is a promising and important direction in machine learning, which attempts to leverage the knowledge contained in a source domain to improve the learning performance or minimize the number of labeled samples required in a target domain.


Transfer Learning with TensorFlow 2 – Model Fine Tuning

#artificialintelligence

In the previous article, we had a chance to explore transfer learning with TensorFlow 2. We used several huge pre-trained models: VGG16, GoogLeNet and ResNet. These architectures are all trained on the ImageNet dataset and their weights are stored. We specialized them for the "Cats vs Dogs" dataset, which contains 23,262 images of cats and dogs. There are many pre-trained models available at tensorflow.keras.applications. In essence, there are two ways in which you can use them.


Towards Making Deep Transfer Learning Never Hurt

arXiv.org Machine Learning

Transfer learning has been frequently used to improve deep neural network training by incorporating the weights of pre-trained networks as the starting point of optimization for regularization. While deep transfer learning can usually boost performance with better accuracy and faster convergence, transferring weights from inappropriate networks hurts the training procedure and may lead to even lower accuracy. In this paper, we consider deep transfer learning as minimizing a linear combination of the empirical loss and a regularizer based on pre-trained weights, where the regularizer may restrict the training procedure from lowering the empirical loss because of conflicting descent directions (e.g., derivatives). Following this view, we propose a novel strategy for making regularization-based Deep Transfer learning Never Hurt (DTNH) that, for each iteration of the training procedure, computes the derivatives of the two terms separately, then re-estimates a new descent direction that does not hurt the empirical loss minimization while preserving the regularization effects of the pre-trained weights. Extensive experiments have been done using common transfer learning regularizers, such as L2-SP and knowledge distillation, on top of a wide range of deep transfer learning benchmarks including Caltech, MIT indoor 67, CIFAR-10 and ImageNet. The empirical results show that the proposed descent direction estimation strategy DTNH can always improve the performance of deep transfer learning tasks based on all of the above regularizers, even when transferring pre-trained weights from inappropriate networks. All in all, the DTNH strategy improves state-of-the-art regularizers with 0.1%--7% higher accuracy in all experiments.
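The idea of computing the two derivatives separately and then re-estimating the descent direction can be sketched as follows. The abstract does not give the exact re-estimation rule, so the projection used here (removing the component of the regularizer gradient that opposes the loss gradient) is an assumption chosen for illustration, not the DTNH procedure itself. The regularizer gradient assumes a simplified L2-SP penalty of the form (alpha / 2) * ||w - w0||^2, and all function names are hypothetical.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2_sp_grad(weights: Vector, pretrained: Vector, alpha: float) -> Vector:
    """Gradient of (alpha / 2) * ||w - w0||^2 with respect to w."""
    return [alpha * (w - w0) for w, w0 in zip(weights, pretrained)]


def non_conflicting_direction(loss_grad: Vector, reg_grad: Vector) -> Vector:
    """Combine the two gradients without opposing the loss gradient.

    Assumed rule: if the regularizer gradient has a negative inner product
    with the loss gradient, project it onto the subspace orthogonal to the
    loss gradient before adding the two together.
    """
    g_dot_r = dot(loss_grad, reg_grad)
    g_norm_sq = dot(loss_grad, loss_grad)
    if g_dot_r < 0.0 and g_norm_sq > 0.0:
        scale = g_dot_r / g_norm_sq
        reg_grad = [r - scale * g for r, g in zip(reg_grad, loss_grad)]
    return [g + r for g, r in zip(loss_grad, reg_grad)]


def sgd_step(weights: Vector, direction: Vector, lr: float) -> Vector:
    """Plain gradient step along the negative of the combined direction."""
    return [w - lr * d for w, d in zip(weights, direction)]


if __name__ == "__main__":
    weights = [1.0, 2.0]
    pretrained = [3.0, 1.0]   # hypothetical pre-trained weights w0
    loss_grad = [1.0, 0.0]    # hypothetical gradient of the empirical loss
    reg_grad = l2_sp_grad(weights, pretrained, alpha=1.0)
    direction = non_conflicting_direction(loss_grad, reg_grad)
    print(direction)
    print(sgd_step(weights, direction, lr=0.1))
```

With this rule the combined direction always has a non-negative inner product with the loss gradient, so to first order a step along its negative does not increase the empirical loss, while the part of the regularizer that does not conflict is kept.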


Transfer Learning for Dog Breed classifier

#artificialintelligence

Dogs are man's best friend and they deserve to be identified correctly. In pursuit of differentiating a Husky (Go Dawgs!) from an Alaskan Malamute, let's learn how to use transfer learning to classify dog breeds. Find the entire Jupyter Notebook on my GitHub. NOTE: This project/article is based off of Udacity's skeleton Dog Breed Classifier project as part of the AIND program with certain modifications. As always with most of my technical posts, we need to make sure we have the data we want to work with.