Transfer Learning is the reuse of a pre-trained model on a new problem. (Towards Data Science)
With a new $1.5 million grant, the growing field of transfer learning has come to the Ming Hsieh Department of Electrical and Computer Engineering at the USC Viterbi School of Engineering. The grant was awarded to three professors -- Salman Avestimehr, Antonio Ortega and Mahdi Soltanolkotabi -- who will work with Ilias Diakonikolas at the University of Wisconsin, Madison, to address the theoretical foundations of this field. Modern machine learning models are breaking new ground in data science, achieving unprecedented performance on tasks like classifying images into one thousand different categories. This is achieved by training gigantic neural networks. "Neural networks work really well because they can be trained on huge amounts of pre-existing data that has previously been tagged and collected," said Avestimehr, the principal investigator of the project.
In the previous article, we explored transfer learning with TensorFlow 2. We used several huge pre-trained models: VGG16, GoogLeNet and ResNet. These architectures were all trained on the ImageNet dataset, and their weights are stored and reusable. We specialized them for the "Cats vs Dogs" dataset, which contains 23,262 images of cats and dogs. There are many pre-trained models available at tensorflow.keras.applications. In essence, there are two ways in which you can use them.
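Those two ways -- freezing the pretrained base and using it as a fixed feature extractor, or fine-tuning some of its layers along with a new head -- can be sketched as follows. This is a minimal tf.keras sketch, not the article's exact notebook; `weights=None` keeps it self-contained and offline, whereas in practice you would pass `weights="imagenet"` to actually reuse the pretrained ImageNet features:

```python
import tensorflow as tf

NUM_CLASSES = 2  # e.g. cats vs. dogs

# Way 1: feature extraction -- freeze the convolutional base and
# train only a new classification head on top of it.
base = tf.keras.applications.VGG16(
    weights=None,          # use weights="imagenet" in practice
    include_top=False,     # drop the original 1000-class ImageNet head
    input_shape=(96, 96, 3),
)
base.trainable = False     # freeze all pretrained layers

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])

# Way 2: fine-tuning -- unfreeze the top few layers of the base so they
# are updated together with the new head, usually with a small learning
# rate so the pretrained features are not destroyed.
base.trainable = True
for layer in base.layers[:-4]:   # keep all but the last few layers frozen
    layer.trainable = False

model.compile(
    optimizer=tf.keras.optimizers.Adam(1e-5),  # small LR for fine-tuning
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
```

After either setup, `model.fit` is called on the new dataset exactly as with any Keras model.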
Dogs are man's best friend, and they deserve to be identified correctly. In pursuit of differentiating a Husky (Go Dawgs!) from an Alaskan Malamute, let's learn how to use transfer learning to classify dog breeds. Find the entire Jupyter Notebook on my GitHub. NOTE: This project/article is based on Udacity's skeleton Dog Breed Classifier project from the AIND program, with certain modifications. As with most of my technical posts, we first need to make sure we have the data we want to work with.
Abstract: We describe a neural network-based system for text-to-speech (TTS) synthesis that is able to generate speech audio in the voice of many different speakers, including those unseen during training. Our system consists of three independently trained components: (1) a speaker encoder network, trained on a speaker verification task using an independent dataset of noisy speech from thousands of speakers without transcripts, to generate a fixed-dimensional embedding vector from seconds of reference speech from a target speaker; (2) a sequence-to-sequence synthesis network based on Tacotron 2, which generates a mel spectrogram from text, conditioned on the speaker embedding; (3) an auto-regressive WaveNet-based vocoder that converts the mel spectrogram into a sequence of time domain waveform samples. We demonstrate that the proposed model is able to transfer the knowledge of speaker variability learned by the discriminatively-trained speaker encoder to the new task, and is able to synthesize natural speech from speakers that were not seen during training. We quantify the importance of training the speaker encoder on a large and diverse speaker set in order to obtain the best generalization performance. Finally, we show that randomly sampled speaker embeddings can be used to synthesize speech in the voice of novel speakers dissimilar from those used in training, indicating that the model has learned a high quality speaker representation.
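The three-component pipeline described in the abstract can be illustrated purely at the level of data flow: variable-length reference audio in, fixed-dimensional speaker embedding, mel spectrogram, waveform out. The following is a toy numpy mock-up -- the functions, embedding size, and frame counts are illustrative stand-ins, not the paper's actual networks:

```python
import numpy as np

EMBED_DIM = 256   # illustrative speaker-embedding size
N_MELS = 80       # mel-spectrogram channels (typical for Tacotron 2)
HOP = 200         # waveform samples per spectrogram frame (illustrative)

def speaker_encoder(reference_audio: np.ndarray) -> np.ndarray:
    """Map variable-length reference speech to a fixed-dim embedding."""
    # Stand-in for the verification-trained encoder: pool over time,
    # project to EMBED_DIM, and L2-normalize.
    pooled = reference_audio.mean()
    emb = np.full(EMBED_DIM, pooled)
    return emb / (np.linalg.norm(emb) + 1e-8)

def synthesizer(text: str, speaker_emb: np.ndarray) -> np.ndarray:
    """Generate a mel spectrogram from text, conditioned on the embedding."""
    n_frames = 5 * len(text)  # toy length heuristic
    return np.zeros((n_frames, N_MELS)) + speaker_emb[:N_MELS]

def vocoder(mel: np.ndarray) -> np.ndarray:
    """Convert the mel spectrogram into time-domain waveform samples."""
    return np.zeros(mel.shape[0] * HOP)

reference = np.random.randn(16000 * 3)   # ~3 s of reference audio
emb = speaker_encoder(reference)         # fixed 256-dim embedding
mel = synthesizer("hello world", emb)    # (frames, 80) spectrogram
wav = vocoder(mel)                       # (frames * HOP,) waveform
```

The key property the abstract describes is that the three components are trained independently, and only the fixed-dimensional embedding couples the encoder to the synthesizer.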
Text classification has numerous applications, from tweet sentiment to product reviews to toxic comments. It's a popular project topic among Insight Fellows; however, a lot of time is spent collecting labeled datasets, cleaning data, and deciding which classification method to use. Services like Clarifai and Google AutoML have made it very easy to create image classification models with less labeled data, but it's not as easy to create such models for text classification. For image classification tasks, transfer learning has proven to be very effective in providing good accuracy with smaller labeled datasets. Transfer learning is a technique that enables the transfer of knowledge learned from one dataset to another.
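A minimal sketch of what this transfer looks like for text: reuse pretrained word vectors as frozen features and train only a small classifier head on top. The embedding matrix below is a random stand-in so the example stays self-contained; in practice you would load vectors such as GloVe or fastText and keep them frozen:

```python
import numpy as np

rng = np.random.default_rng(0)

# "Pretrained" word vectors -- random stand-ins here; in practice these
# would be loaded from GloVe/fastText and kept frozen.
vocab = {"good": 0, "great": 1, "bad": 2, "awful": 3, "movie": 4, "plot": 5}
E = rng.normal(size=(len(vocab), 8))

def featurize(sentence: str) -> np.ndarray:
    """Average the frozen pretrained vectors of the known words."""
    ids = [vocab[w] for w in sentence.split() if w in vocab]
    return E[ids].mean(axis=0)

# Tiny labeled set: 1 = positive, 0 = negative.
texts = ["good movie", "great plot", "bad movie", "awful plot"]
labels = np.array([1, 1, 0, 0])
X = np.stack([featurize(t) for t in texts])

# Train only a logistic-regression head; the embeddings stay fixed,
# which is the "transfer" part.
w, b = np.zeros(X.shape[1]), 0.0
for _ in range(2000):
    p = 1 / (1 + np.exp(-(X @ w + b)))
    grad = p - labels
    w -= 0.5 * X.T @ grad / len(labels)
    b -= 0.5 * grad.mean()

preds = (1 / (1 + np.exp(-(X @ w + b))) > 0.5).astype(int)
```

Because the word vectors carry knowledge from whatever corpus they were pretrained on, only the small head needs labeled examples from the new task.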
In this notebook we will learn how to use transfer learning to create a powerful convolutional neural network with very little effort, with the help of MobileNetV2, a model developed by Google that has been trained on a large dataset of images. We will fine-tune this pretrained MobileNetV2 model on our own dataset. Note: when performing transfer learning we must always replace the last layer of the pre-trained model so that it has the same number of classes as the dataset we are working with. Check my Kaggle Notebook link, where you will get an understanding of transfer learning with the help of MobileNetV2.
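That last-layer replacement looks like the following minimal tf.keras sketch. The class count and input size are placeholders, and `weights=None` keeps the example offline; pass `weights="imagenet"` to actually reuse Google's pretrained features:

```python
import tensorflow as tf

NUM_CLASSES = 5  # however many classes your dataset has

# Load MobileNetV2 without its original 1000-class ImageNet head.
base = tf.keras.applications.MobileNetV2(
    input_shape=(160, 160, 3),
    include_top=False,     # drop the original classification layer
    weights=None,          # use weights="imagenet" in practice
)
base.trainable = False     # freeze the pretrained feature extractor

# Attach a new head sized for *our* classes in place of the old one.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

Only the new head's weights are trained at first; the frozen MobileNetV2 base acts as a fixed feature extractor.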
This type of cross-lingual transfer learning can make it easier to bootstrap a model in a language for which training data is scarce, by taking advantage of more abundant data in a source language. But sometimes the data in the source language is so abundant that using all of it to train a transfer model would be impractically time consuming. Moreover, linguistic differences between source and target languages mean that pruning the training data in the source language, so that its statistical patterns better match those of the target language, can actually improve the performance of the transferred model. In a paper we're presenting at this year's Conference on Empirical Methods in Natural Language Processing, we describe experiments with a new data selection technique that lets us halve the amount of training data required in the source language while actually improving the transfer model's performance in the target language. For evaluation purposes, we used two techniques to cut the source-language data set in half: one was our data selection technique, and the other was random sampling.
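The paper's specific selection technique isn't detailed here, but a classic baseline conveys the idea of pruning source data to match the target language's statistical patterns: score each source sentence under a simple language model of the target data and keep only the best-matching half. Below is a sketch using an add-one-smoothed unigram model -- an illustration of the general approach, not the authors' method:

```python
import math
from collections import Counter

def unigram_logprob(sentence, counts, total, vocab_size):
    """Add-one-smoothed unigram log-probability, averaged per token."""
    words = sentence.split()
    lp = sum(math.log((counts[w] + 1) / (total + vocab_size)) for w in words)
    return lp / len(words)

def select_source_data(source_sents, target_sents, keep_fraction=0.5):
    """Keep the source sentences that look most like the target-language
    data under a unigram model of that data (higher score = more similar)."""
    counts = Counter(w for s in target_sents for w in s.split())
    total = sum(counts.values())
    vocab = len(counts)
    scored = sorted(source_sents,
                    key=lambda s: unigram_logprob(s, counts, total, vocab),
                    reverse=True)
    k = max(1, int(len(scored) * keep_fraction))
    return scored[:k]

# Toy usage: sentences sharing vocabulary with the target data survive
# the cut; off-domain sentences are pruned away.
target = ["play some music", "turn on the music"]
source = ["play some music please", "turn the music on",
          "stock prices fell sharply", "the quarterly report shipped"]
kept = select_source_data(source, target, keep_fraction=0.5)
```

Halving the source data this way addresses both problems named above: training time drops, and the retained data is statistically closer to the target language.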
The first thing to do in any machine learning task is to collect the data. What we need are thousands of images with labeled facial expressions. The public FER dataset is a great starting point with 28,709 labeled images. However, since the resolution of these images is only 48 x 48, it would be nice to also have a dataset with richer features. One option is to query an image search engine for each expression keyword. The benefit of this approach is that (a) it retrieves thousands of images "from the wild" and (b) we can automatically label the images using the keywords in the query.
Cross-lingual learning is an AI technique involving training a natural language processing model in one language and retraining it in another. It's been demonstrated that retrained models can outperform those trained from scratch in the second language, which is likely why researchers at Amazon's Alexa division are investing considerable time investigating them. In a paper scheduled to be presented at this year's Conference on Empirical Methods in Natural Language Processing, two scientists at the Alexa AI natural understanding group -- Quynh Do and Judith Gaspers -- and colleagues propose a data selection technique that halves the amount of required training data. They claim that it surprisingly improves rather than compromises the model's overall performance in the target language. "Sometimes the data in the source language is so abundant that using all of it to train a transfer model would be impractically time consuming," wrote Do and Gaspers in a blog post.