Goto

Collaborating Authors

 Transfer Learning


Adversarially robust transfer learning

arXiv.org Machine Learning

Transfer learning, in which a network is trained on one task and re-purposed on another, is often used to produce neural network classifiers when data is scarce or full-scale training is too costly. When the goal is to produce a model that is not only accurate but also adversarially robust, data scarcity and computational limitations become even more cumbersome. We consider robust transfer learning, in which we transfer not only performance but also robustness from a source model to a target domain. We start by observing that robust networks contain robust feature extractors. By training classifiers on top of these feature extractors, we produce new models that inherit the robustness of their parent networks. We then consider the case of "fine tuning" a network by retraining end-to-end in the target domain. When using lifelong learning strategies, this process preserves the robustness of the source network while achieving high accuracy. By using such strategies, it is possible to produce accurate and robust models with little data, and without the cost of adversarial training.


Learning What and Where to Transfer

arXiv.org Machine Learning

As the application of deep learning has expanded to real-world problems with insufficient volume of training data, transfer learning recently has gained much attention as means of improving the performance in such small-data regime. However, when existing methods are applied between heterogeneous architectures and tasks, it becomes more important to manage their detailed configurations and often requires exhaustive tuning on them for the desired performance. To address the issue, we propose a novel transfer learning approach based on meta-learning that can automatically learn what knowledge to transfer from the source network to where in the target network. Given source and target networks, we propose an efficient training scheme to learn meta-networks that decide (a) which pairs of layers between the source and target networks should be matched for knowledge transfer and (b) which features and how much knowledge from each feature should be transferred. We validate our meta-transfer approach against recent transfer learning methods on various datasets and network architectures, on which our automated scheme significantly outperforms the prior baselines that find "what and where to transfer" in a hand-crafted manner.


SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems

arXiv.org Artificial Intelligence

In the last year, new models and methods for pretraining and transfer learning have driven striking performance improvements across a range of language understanding tasks. The GLUE benchmark, introduced one year ago, offers a single-number metric that summarizes progress on a diverse set of such tasks, but performance on the benchmark has recently come close to the level of non-expert humans, suggesting limited headroom for further research. This paper recaps lessons learned from the GLUE benchmark and presents SuperGLUE, a new benchmark styled after GLUE with a new set of more difficult language understanding tasks, improved resources, and a new public leaderboard. SuperGLUE will be available soon at super.gluebenchmark.com.


Representation Similarity Analysis for Efficient Task taxonomy & Transfer Learning

arXiv.org Artificial Intelligence

Transfer learning is widely used in deep neural network models when there are few labeled examples available. The common approach is to take a pre-trained network in a similar task and finetune the model parameters. This is usually done blindly without a pre-selection from a set of pre-trained models, or by finetuning a set of models trained on different tasks and selecting the best performing one by cross-validation. We address this problem by proposing an approach to assess the relationship between visual tasks and their task-specific models. Our method uses Representation Similarity Analysis (RSA), which is commonly used to find a correlation between neuronal responses from brain data and models. With RSA we obtain a similarity score among tasks by computing correlations between models trained on different tasks. Our method is efficient as it requires only pre-trained models, and a few images with no further training. We demonstrate the effectiveness and efficiency of our method for generating task taxonomy on Taskonomy dataset. We next evaluate the relationship of RSA with the transfer learning performance on Taskonomy tasks and a new task: Pascal VOC semantic segmentation. Our results reveal that models trained on tasks with higher similarity score show higher transfer learning performance. Surprisingly, the best transfer learning result for Pascal VOC semantic segmentation is not obtained from the pre-trained model on semantic segmentation, probably due to the domain differences, and our method successfully selects the high performing models.


Attention-based Transfer Learning for Brain-computer Interface

arXiv.org Artificial Intelligence

Different functional areas of the human brain play different roles in brain activity, which has not been paid sufficient research attention in the brain-computer interface (BCI) field. This paper presents a new approach for electroencephalography (EEG) classification that applies attention-based transfer learning. Our approach considers the importance of different brain functional areas to improve the accuracy of EEG classification, and provides an additional way to automatically identify brain functional areas associated with new activities without the involvement of a medical professional. We demonstrate empirically that our approach out-performs state-of-the-art approaches in the task of EEG classification, and the results of visualization indicate that our approach can detect brain functional areas related to a certain task.


Deep Transfer Learning for Single-Channel Automatic Sleep Staging with Channel Mismatch

arXiv.org Machine Learning

Many sleep studies suffer from the problem of insufficient data to fully utilize deep neural networks as different labs use different recordings set ups, leading to the need of training automated algorithms on rather small databases, whereas large annotated databases are around but cannot be directly included into these studies for data compensation due to channel mismatch. This work presents a deep transfer learning approach to overcome the channel mismatch problem and transfer knowledge from a large dataset to a small cohort to study automatic sleep staging with single-channel input. We employ the state-of-the-art SeqSleepNet and train the network in the source domain, i.e. the large dataset. Afterwards, the pretrained network is finetuned in the target domain, i.e. the small cohort, to complete knowledge transfer. We study two transfer learning scenarios with slight and heavy channel mismatch between the source and target domains. We also investigate whether, and if so, how finetuning entirely or partially the pretrained network would affect the performance of sleep staging on the target domain. Using the Montreal Archive of Sleep Studies (MASS) database consisting of 200 subjects as the source domain and the Sleep-EDF Expanded database consisting of 20 subjects as the target domain in this study, our experimental results show significant performance improvement on sleep staging achieved with the proposed deep transfer learning approach. Furthermore, these results also reveal the essential of finetuning the feature-learning parts of the pretrained network to be able to bypass the channel mismatch problem.


Easy Transfer Learning By Exploiting Intra-domain Structures

arXiv.org Machine Learning

Transfer learning aims at transferring knowledge from a well-labeled domain to a similar but different domain with limited or no labels. Unfortunately, existing learning-based methods often involve intensive model selection and hyperparameter tuning to obtain good results. Moreover, cross-validation is not possible for tuning hyperparameters since there are often no labels in the target domain. This would restrict wide applicability of transfer learning especially in computationally-constraint devices such as wearables. In this paper, we propose a practically Easy Transfer Learning (EasyTL) approach which requires no model selection and hyperparameter tuning, while achieving competitive performance. By exploiting intra-domain structures, EasyTL is able to learn both non-parametric transfer features and classifiers. Extensive experiments demonstrate that, compared to state-of-the-art traditional and deep methods, EasyTL satisfies the Occam's Razor principle: it is extremely easy to implement and use while achieving comparable or better performance in classification accuracy and much better computational efficiency. Additionally, it is shown that EasyTL can increase the performance of existing transfer feature learning methods.


On Better Exploring and Exploiting Task Relationships in Multi-Task Learning: Joint Model and Feature Learning

arXiv.org Artificial Intelligence

Multitask learning (MTL) aims to learn multiple tasks simultaneously through the interdependence between different tasks. The way to measure the relatedness between tasks is always a popular issue. There are mainly two ways to measure relatedness between tasks: common parameters sharing and common features sharing across different tasks. However, these two types of relatedness are mainly learned independently, leading to a loss of information. In this paper, we propose a new strategy to measure the relatedness that jointly learns shared parameters and shared feature representations. The objective of our proposed method is to transform the features from different tasks into a common feature space in which the tasks are closely related and the shared parameters can be better optimized. We give a detailed introduction to our proposed multitask learning method. Additionally, an alternating algorithm is introduced to optimize the nonconvex objection. A theoretical bound is given to demonstrate that the relatedness between tasks can be better measured by our proposed multitask learning algorithm. We conduct various experiments to verify the superiority of the proposed joint model and feature a multitask learning method.


Characterizing and Avoiding Negative Transfer

arXiv.org Machine Learning

When labeled data is scarce for a specific target task, transfer learning often offers an effective solution by utilizing data from a related source task. However, when transferring knowledge from a less related source, it may inversely hurt the target performance, a phenomenon known as negative transfer. Despite its pervasiveness, negative transfer is usually described in an informal manner, lacking rigorous definition, careful analysis, or systematic treatment. This paper proposes a formal definition of negative transfer and analyzes three important aspects thereof. Stemming from this analysis, a novel technique is proposed to circumvent negative transfer by filtering out unrelated source data. Based on adversarial networks, the technique is highly generic and can be applied to a wide range of transfer learning algorithms. The proposed approach is evaluated on six state-of-the-art deep transfer methods via experiments on four benchmark datasets with varying levels of difficulty. Empirically, the proposed method consistently improves the performance of all baseline methods and largely avoids negative transfer, even when the source data is degenerate.


Cross-Subject Transfer Learning in Human Activity Recognition Systems using Generative Adversarial Networks

arXiv.org Machine Learning

Application of intelligent systems especially in smart homes and health-related topics has been drawing more attention in the last decades. Training Human Activity Recognition (HAR) models -- as a major module -- requires a fair amount of labeled data. Despite training with large datasets, most of the existing models will face a dramatic performance drop when they are tested against unseen data from new users. Moreover, recording enough data for each new user is unviable due to the limitations and challenges of working with human users. Transfer learning techniques aim to transfer the knowledge which has been learned from the source domain (subject) to the target domain in order to decrease the models' performance loss in the target domain. This paper presents a novel method of adversarial knowledge transfer named SA-GAN stands for Subject Adaptor GAN which utilizes Generative Adversarial Network framework to perform cross-subject transfer learning in the domain of wearable sensor-based Human Activity Recognition. SA-GAN outperformed other state-of-the-art methods in more than 66% of experiments and showed the second best performance in the remaining 25% of experiments. In some cases, it reached up to 90% of the accuracy which can be obtained by supervised training over the same domain data.