AITopics | Transfer Learning


Transfer Learning

Transfer Learning is the reuse of a pre-trained model on a new problem. (Towards Data Science)


Heterogeneous multitask learning with joint sparsity constraints

Yang, Xiaolin, Kim, Seyoung, Xing, Eric P.

Neural Information Processing Systems, Feb-15-2020, 04:11:59 GMT

Multitask learning addresses the problem of learning related tasks whose information can be shared with each other. In this paper we consider the problem of learning multiple related tasks, where the tasks consist of both continuous and discrete outputs from a common set of input variables that lie in a high-dimensional space. All of the tasks are related in the sense that they share the same set of relevant input variables, but the amount of influence of each input on different outputs may vary. We formulate this problem as a combination of linear regression and logistic regression and model the joint sparsity as L1/Linf and L1/L2 norms of the model parameters. Among several possible applications, our approach addresses an important open problem in genetic association mapping, where we are interested in discovering genetic markers that influence multiple correlated traits jointly.
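
As a rough illustration of the two joint sparsity penalties named in the abstract, the sketch below computes the L1/L2 and L1/Linf mixed norms of a parameter matrix in plain Python. The matrix layout (one row per input variable, one column per task), the function names, and the toy values are assumptions made for illustration; they are not the paper's notation or code.

```python
import math

def l1_l2_norm(weights):
    """Sum over input variables (rows) of the L2 norm taken across tasks (columns)."""
    return sum(math.sqrt(sum(w * w for w in row)) for row in weights)

def l1_linf_norm(weights):
    """Sum over input variables (rows) of the largest absolute weight across tasks."""
    return sum(max(abs(w) for w in row) for row in weights)

# Hypothetical 3-input, 2-task parameter matrix. The second input is
# irrelevant to every task, so its whole row is zero.
W = [
    [3.0, -4.0],
    [0.0, 0.0],
    [1.0, 0.0],
]

print(l1_l2_norm(W))    # 5.0 + 0.0 + 1.0 = 6.0
print(l1_linf_norm(W))  # 4.0 + 0.0 + 1.0 = 5.0
```

Both penalties charge for an input variable as a whole row, so minimizing them tends to zero out entire rows. That matches the abstract's setting: tasks share the same relevant inputs while the size of each input's influence can differ per task.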

association mapping, heterogeneous multitask, joint sparsity constraint, (1 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.43)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.63)


Multitask Learning without Label Correspondences

Quadrianto, Novi, Petterson, James, Caetano, Tibério S., Smola, Alex J., Vishwanathan, S.V.N.

Neural Information Processing Systems, Feb-15-2020, 03:11:11 GMT

We propose an algorithm to perform multitask learning where each task has potentially distinct label sets and label correspondences are not readily available. This is in contrast with existing methods which either assume that the label sets shared by different tasks are the same or that there exists a label mapping oracle. Our method directly maximizes the mutual information among the labels, and we show that the resulting objective function can be efficiently optimized using existing algorithms. Our proposed approach has a direct application for data integration with different label spaces for the purpose of classification, such as integrating Yahoo! and DMOZ web directories. Papers published at the Neural Information Processing Systems Conference.
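
To make the quantity being maximized more concrete, here is a minimal sketch that computes the empirical mutual information between two label sets from a table of co-occurrence counts. The count tables and the function name are hypothetical, and the sketch only evaluates mutual information for a fixed table; it does not reproduce the paper's optimization procedure.

```python
import math

def mutual_information(joint_counts):
    """Empirical mutual information (in nats) from a table of co-occurrence counts.

    joint_counts[i][j] is how often label i of one task co-occurs with
    label j of the other task.
    """
    total = sum(sum(row) for row in joint_counts)
    row_totals = [sum(row) for row in joint_counts]
    col_totals = [sum(col) for col in zip(*joint_counts)]
    mi = 0.0
    for i, row in enumerate(joint_counts):
        for j, count in enumerate(row):
            if count == 0:
                continue  # a zero cell contributes nothing to the sum
            p_xy = count / total
            p_x = row_totals[i] / total
            p_y = col_totals[j] / total
            mi += p_xy * math.log(p_xy / (p_x * p_y))
    return mi

# Hypothetical tables: labels that line up perfectly versus unrelated labels.
aligned = [[50, 0], [0, 50]]
independent = [[25, 25], [25, 25]]

print(mutual_information(aligned))      # log(2) nats: one label determines the other
print(mutual_information(independent))  # 0.0: the labels carry no shared information
```

A high value indicates that knowing a label in one label space says a lot about the label in the other, which is the kind of agreement one would want when integrating two directories with different categories.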

algorithm, label correspondence, multitask learning

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.68)


Transfer Learning by Distribution Matching for Targeted Advertising

Bickel, Steffen, Sawade, Christoph, Scheffer, Tobias

Neural Information Processing Systems, Feb-15-2020, 01:12:07 GMT

We address the problem of learning classifiers for several related tasks that may differ in their joint distribution of input and output variables. For each task, small (possibly even empty) labeled samples and large unlabeled samples are available. While the unlabeled samples reflect the target distribution, the labeled samples may be biased. We derive a solution that produces resampling weights which match the pool of all examples to the target distribution of any given task. Our work is motivated by the problem of predicting sociodemographic features for users of web portals, based on the content which they have accessed.

distribution matching, targeted advertising, transfer learning, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.46)


Learning To Learn Around A Common Mean

Denevi, Giulia, Ciliberto, Carlo, Stamos, Dimitris, Pontil, Massimiliano

Neural Information Processing Systems, Feb-14-2020, 20:58:08 GMT

The problem of learning-to-learn (LTL) or meta-learning is gaining increasing attention due to recent empirical evidence of its effectiveness in applications. The goal addressed in LTL is to select an algorithm that works well on tasks sampled from a meta-distribution. In this work, we consider the family of algorithms given by a variant of Ridge Regression, in which the regularizer is the square distance to an unknown mean vector. We show that, in this setting, the LTL problem can be reformulated as a Least Squares (LS) problem and we exploit a novel meta-algorithm to efficiently solve it. At each iteration the meta-algorithm processes only one dataset.
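
The Ridge Regression variant described above can be sketched in one dimension. The code assumes the objective (1/n) * sum((x_i * w - y_i)^2) + lam * (w - h)^2, where h is the mean the solution is pulled toward; setting the derivative to zero gives a closed form. The scalar setting, the normalization, and the names `biased_ridge_1d`, `h`, and `lam` are assumptions for illustration, and the paper's meta-algorithm for learning the mean is not shown.

```python
def biased_ridge_1d(xs, ys, h, lam):
    """Minimize (1/n) * sum((x*w - y)**2) + lam * (w - h)**2 over a scalar w.

    Setting the derivative to zero gives
        w = (sum(x*y) + n*lam*h) / (sum(x*x) + n*lam).
    """
    n = len(xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return (sxy + n * lam * h) / (sxx + n * lam)

# Toy dataset generated by y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

# h = 0 recovers standard ridge regression, which shrinks the estimate toward 0.
print(biased_ridge_1d(xs, ys, h=0.0, lam=1.0))  # 28/17, below the true slope of 2

# With the mean placed at the true slope, the regularizer no longer biases the estimate.
print(biased_ridge_1d(xs, ys, h=2.0, lam=1.0))  # 34/17 = 2.0
```

The comparison hints at why a good mean helps: when tasks cluster around a common parameter value, regularizing toward that value costs far less than regularizing toward zero.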

algorithm, dataset, learning

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.64)


Hardware Conditioned Policies for Multi-Robot Transfer Learning

Chen, Tao, Murali, Adithyavairavan, Gupta, Abhinav

Neural Information Processing Systems, Feb-14-2020, 20:42:25 GMT

Deep reinforcement learning could be used to learn dexterous robotic policies but it is challenging to transfer them to new robots with vastly different hardware properties. It is also prohibitively expensive to learn a new policy from scratch for each robot hardware due to the high sample complexity of modern state-of-the-art algorithms. We propose a novel approach called Hardware Conditioned Policies where we train a universal policy conditioned on a vector representation of robot hardware. We considered robots in simulation with varied dynamics, kinematic structure, kinematic lengths and degrees-of-freedom. First, we use the kinematic structure directly as the hardware encoding and show great zero-shot transfer to completely novel robots not seen during training.

hardware conditioned policy, multi-robot transfer learning, robot hardware

Neural Information Processing Systems

Genre: Research Report (0.43)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.40)


Transfer Learning with Neural AutoML

Wong, Catherine, Houlsby, Neil, Lu, Yifeng, Gesmundo, Andrea

Neural Information Processing Systems, Feb-14-2020, 20:13:17 GMT

We reduce the computational cost of Neural AutoML with transfer learning. Neural AutoML has become popular for the design of deep learning architectures; however, this method has a high computational cost. To address this, we propose Transfer Neural AutoML, which uses knowledge from prior tasks to speed up network design. We extend RL-based architecture search methods to support parallel training on multiple tasks and then transfer the search strategy to new tasks. On language and image classification data, Transfer Neural AutoML reduces convergence time over single-task training by over an order of magnitude on many tasks.

neural automl, transfer learning

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.32)


Learning to Model the Tail

Wang, Yu-Xiong, Ramanan, Deva, Hebert, Martial

Neural Information Processing Systems, Feb-14-2020, 19:42:54 GMT

We describe an approach to learning from long-tailed, imbalanced datasets that are prevalent in real-world settings. Here, the challenge is to learn accurate "few-shot" models for classes in the tail of the class distribution, for which little data is available. We cast this problem as transfer learning, where knowledge from the data-rich classes in the head of the distribution is transferred to the data-poor classes in the tail. Our key insights are as follows. First, we propose to transfer meta-knowledge about learning-to-learn from the head classes.

knowledge, learning, model parameter, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.61)


Scalable Hyperparameter Transfer Learning

Perrone, Valerio, Jenatton, Rodolphe, Seeger, Matthias W., Archambeau, Cedric

Neural Information Processing Systems, Feb-14-2020, 18:57:54 GMT

Bayesian optimization (BO) is a model-based approach for gradient-free black-box function optimization, such as hyperparameter optimization. Typically, BO relies on conventional Gaussian process (GP) regression, whose algorithmic complexity is cubic in the number of evaluations. As a result, GP-based BO cannot leverage large numbers of past function evaluations, for example, to warm-start related BO runs. We propose a multi-task adaptive Bayesian linear regression model for transfer learning in BO, whose complexity is linear in the function evaluations: one Bayesian linear regression model is associated to each black-box function optimization problem (or task), while transfer learning is achieved by coupling the models through a shared deep neural net. Experiments show that the neural net learns a representation suitable for warm-starting the black-box optimization problems and that BO runs can be accelerated when the target black-box function (e.g., validation loss) is learned together with other related signals (e.g., training loss).


Sparse Overlapping Sets Lasso for Multitask Learning and its Application to fMRI Analysis

Rao, Nikhil, Cox, Christopher, Nowak, Rob, Rogers, Timothy T.

Neural Information Processing Systems, Feb-14-2020, 18:10:38 GMT

Multitask learning can be effective when features useful in one task are also useful for other tasks, and the group lasso is a standard method for selecting a common subset of features. In this paper, we are interested in a less restrictive form of multitask learning, wherein (1) the available features can be organized into subsets according to a notion of similarity and (2) features useful in one task are similar, but not necessarily identical, to the features best suited for other tasks. The main contribution of this paper is a new procedure called Sparse Overlapping Sets (SOS) lasso, a convex optimization that automatically selects similar features for related learning tasks. Error bounds are derived for SOSlasso and its consistency is established for squared error loss. In particular, SOSlasso is motivated by multi-subject fMRI studies in which functional activity is classified using brain voxels as features.

fmri analysis, multitask learning, sparse overlapping set lasso, (4 more...)

Neural Information Processing Systems

Industry: Health & Medicine > Health Care Technology (0.65)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.91)


Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis

Jia, Ye, Zhang, Yu, Weiss, Ron, Wang, Quan, Shen, Jonathan, Ren, Fei, Chen, Zhifeng, Nguyen, Patrick, Pang, Ruoming, Moreno, Ignacio Lopez, Wu, Yonghui

Neural Information Processing Systems, Feb-14-2020, 14:57:18 GMT

We describe a neural network-based system for text-to-speech (TTS) synthesis that is able to generate speech audio in the voice of many different speakers, including those unseen during training. Our system consists of three independently trained components: (1) a speaker encoder network, trained on a speaker verification task using an independent dataset of noisy speech from thousands of speakers without transcripts, to generate a fixed-dimensional embedding vector from seconds of reference speech from a target speaker; (2) a sequence-to-sequence synthesis network based on Tacotron 2, which generates a mel spectrogram from text, conditioned on the speaker embedding; (3) an auto-regressive WaveNet-based vocoder that converts the mel spectrogram into a sequence of time domain waveform samples. We demonstrate that the proposed model is able to transfer the knowledge of speaker variability learned by the discriminatively-trained speaker encoder to the new task, and is able to synthesize natural speech from speakers that were not seen during training. We quantify the importance of training the speaker encoder on a large and diverse speaker set in order to obtain the best generalization performance. Finally, we show that randomly sampled speaker embeddings can be used to synthesize speech in the voice of novel speakers dissimilar from those used in training, indicating that the model has learned a high quality speaker representation.

multispeaker text-to-speech synthesis, speaker verification, transfer learning, (2 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Synthesis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.74)
