Transfer Learning
Homogeneous Online Transfer Learning with Online Distribution Discrepancy Minimization
Du, Yuntao, Tan, Zhiwen, Chen, Qian, Zhang, Yi, Wang, Chongjun
Transfer learning, which transfers knowledge from related but different source domains to the target domain, has been demonstrated to be successful and essential in diverse applications. Online transfer learning (OTL) is a more challenging problem where the target data arrive in an online manner. Most OTL methods combine the source classifier and the target classifier directly by assigning a weight to each classifier, and adjust the weights constantly. However, these methods pay little attention to reducing the distribution discrepancy between domains. In this paper, we propose a novel online transfer learning method which seeks to find a new feature representation, so that the marginal distribution and conditional distribution discrepancy can be reduced simultaneously online. We focus on online transfer learning with multiple source domains and use the Hedge strategy to leverage knowledge from source domains. We analyze the theoretical properties of the proposed algorithm and provide an upper bound on the number of mistakes. Comprehensive experiments on two real-world datasets show that our method outperforms state-of-the-art methods by a large margin.
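The Hedge strategy mentioned in this abstract can be illustrated with a minimal sketch. This is the generic multiplicative-weights update, not the paper's algorithm: the discount factor `beta`, the 0/1 loss, and the three source classifiers are assumptions made for illustration.

```python
import numpy as np

def hedge_update(weights, losses, beta=0.5):
    """One Hedge step: scale each weight by beta**loss, then renormalize."""
    weights = np.asarray(weights, dtype=float)
    losses = np.asarray(losses, dtype=float)
    new_w = weights * beta ** losses
    return new_w / new_w.sum()

def hedge_predict(weights, predictions):
    """Weighted vote over classifier outputs in {-1, +1}."""
    score = float(np.dot(weights, predictions))
    return 1 if score >= 0 else -1

# Three hypothetical source classifiers with uniform initial weights.
w = np.ones(3) / 3
preds = np.array([1, -1, 1])   # each classifier's prediction for one target sample
y_true = 1                     # label revealed after the prediction is made

y_hat = hedge_predict(w, preds)
losses = (preds != y_true).astype(float)   # 0/1 loss per classifier (assumed)
w = hedge_update(w, losses)                # the wrong classifier loses weight
```

With a 0/1 loss, a classifier that errs has its weight multiplied by `beta`, so sources that keep disagreeing with the target labels are gradually down-weighted.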
Quantifying the Performance of Federated Transfer Learning
Jing, Qinghe, Wang, Weiyan, Zhang, Junxue, Tian, Han, Chen, Kai
The scarcity of data and isolated data islands encourage different organizations to share data with each other to train machine learning models. However, there are increasing concerns about data privacy and security, which urge people to seek a solution like Federated Transfer Learning (FTL) to share training data without violating data privacy. FTL leverages transfer learning techniques to utilize data from different sources for training, while achieving data privacy protection without significant accuracy loss. However, the benefits come at the cost of extra computation and communication, resulting in efficiency problems. In order to efficiently deploy and scale up FTL solutions in practice, we need a deep understanding of how the infrastructure affects the efficiency of FTL. Our paper tries to answer this question by quantitatively measuring a real-world FTL implementation, FATE, on Google Cloud. According to the results of carefully designed experiments, we verified that the following bottlenecks can be further optimized: 1) inter-process communication is the major bottleneck; 2) data encryption adds considerable computation overhead; 3) the Internet networking condition strongly affects performance when the model is large.
Transfer Learning in General Lensless Imaging through Scattering Media
Yang, Yukuan, Deng, Lei, Jiao, Peng, Chua, Yansong, Pei, Jing, Ma, Cheng, Li, Guoqi
Recently, deep neural networks (DNNs) have been successfully introduced to the field of lensless imaging through scattering media. By solving an inverse problem in computational imaging, DNNs can overcome several shortcomings of conventional methods for lensless imaging through scattering media, namely, high cost, poor quality, complex control, and poor robustness to interference. However, for training, a large number of training samples on various datasets have to be collected, with a DNN trained on one dataset generally performing poorly for recovering images from another dataset. The underlying reason is that lensless imaging through scattering media is a high dimensional regression problem and it is difficult to obtain an analytical solution. In this work, transfer learning is proposed to address this issue. Our main idea is to train a DNN on a relatively complex dataset using a large number of training samples and fine-tune the last few layers using very few samples from other datasets. Because it avoids the thousands of samples required to train from scratch, transfer learning alleviates the problem of costly data acquisition. Specifically, considering the difference in sample sizes and similarity among datasets, we propose two DNN architectures, namely LISMU-FCN and LISMU-OCN, and a balance loss function designed for balancing smoothness and sharpness. LISMU-FCN, with far fewer parameters, can achieve imaging across similar datasets while LISMU-OCN can achieve imaging across significantly different datasets. In addition, we establish a set of simulation algorithms that closely match the real experiment, which is of great significance and practical value for research on lensless scattering imaging. In summary, this work provides a new solution for lensless imaging through scattering media using transfer learning in DNNs.
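The idea of fine-tuning only the last layers on a few samples can be sketched as follows. This is a toy NumPy illustration, not LISMU-FCN or LISMU-OCN: the two-layer model, the random stand-in for pretrained weights, the mean squared error loss, and the learning rate are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for layers pretrained on a large dataset (random here, kept frozen).
W_frozen = rng.normal(size=(8, 16))
# Last layer: the only parameters updated during fine-tuning.
W_last = rng.normal(size=(16, 4)) * 0.1

def features(x):
    """Frozen feature extractor; W_frozen is never modified."""
    return np.tanh(x @ W_frozen)

def mse(x, y, W):
    """Mean squared error of the linear head W on top of the frozen features."""
    return float(np.mean((features(x) @ W - y) ** 2))

# A handful of samples from the new dataset (synthetic placeholders).
x_new = rng.normal(size=(10, 8))
y_new = rng.normal(size=(10, 4))

W_frozen_before = W_frozen.copy()
loss_before = mse(x_new, y_new, W_last)

lr = 0.05
for _ in range(200):
    h = features(x_new)
    # Gradient of the mean squared error with respect to W_last only.
    grad = 2.0 * h.T @ (h @ W_last - y_new) / y_new.size
    W_last -= lr * grad

loss_after = mse(x_new, y_new, W_last)
```

Only `W_last` receives gradient updates, so the number of trainable parameters, and with it the number of samples needed, stays small.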
mRMR-DNN with Transfer Learning for Intelligent Fault Diagnosis of Rotating Machines
Singh, Vikas, Verma, Nishchal K.
In recent years, intelligent condition-based monitoring of rotary machinery systems has become a major research focus of machine fault diagnosis. In condition-based monitoring, it is challenging to form a large-scale well-annotated dataset due to the expense of data acquisition and costly annotation. In addition, the generated data have a large number of redundant features, which degrade the performance of machine learning models. To overcome this, we combine the advantages of minimum redundancy maximum relevance (mRMR) and transfer learning with a deep learning model. In this work, mRMR is combined with a deep learning framework and a deep transfer learning framework to improve fault diagnosis performance in terms of accuracy and computational complexity. mRMR removes redundant information from the data and improves deep learning performance, whereas transfer learning reduces the amount of data needed to train the model. In the proposed work, two frameworks, i.e., mRMR with deep learning and mRMR with deep transfer learning, have been explored and validated on the CWRU and IMS rolling element bearing datasets. The analysis shows that the proposed frameworks obtain better diagnostic accuracy than existing methods and also handle data with a large number of features more quickly.
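A minimal sketch of greedy mRMR feature selection is given below. It assumes discrete features and the common "relevance minus mean redundancy" criterion; the paper's exact variant and preprocessing are not stated in the abstract, and the toy data are hypothetical.

```python
import numpy as np

def mutual_info(a, b):
    """Mutual information (in nats) between two discrete 1-D arrays."""
    a_vals, a_idx = np.unique(a, return_inverse=True)
    b_vals, b_idx = np.unique(b, return_inverse=True)
    joint = np.zeros((a_vals.size, b_vals.size))
    np.add.at(joint, (a_idx, b_idx), 1.0)      # co-occurrence counts
    joint /= joint.sum()
    pa = joint.sum(axis=1, keepdims=True)
    pb = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / (pa @ pb)[nz])))

def mrmr_select(X, y, k):
    """Greedy mRMR: repeatedly add the feature with the highest
    relevance to y minus mean redundancy with already selected features."""
    n_features = X.shape[1]
    relevance = [mutual_info(X[:, j], y) for j in range(n_features)]
    selected = [int(np.argmax(relevance))]
    while len(selected) < k:
        best_j, best_score = None, -np.inf
        for j in range(n_features):
            if j in selected:
                continue
            redundancy = np.mean([mutual_info(X[:, j], X[:, s]) for s in selected])
            score = relevance[j] - redundancy
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
    return selected

# Toy data: feature 1 duplicates feature 0, feature 2 carries new information.
a = np.array([0, 0, 1, 1, 0, 0, 1, 1])
b = np.array([0, 1, 0, 1, 0, 1, 0, 1])
X = np.column_stack([a, a, b])
y = a + 2 * b
selected = mrmr_select(X, y, k=2)
```

In the toy data the duplicated feature is as relevant as the original but fully redundant with it, so the criterion favors the feature that adds new information.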
Understanding Transfer Learning For Medical Applications
From interpreting chest x-rays to identifying eye diseases, transfer learning has proven significant in a variety of standard medical tasks. Therefore, it is extremely important to understand the commonly held assumptions, challenges and other solutions within the realm of transfer learning. A common practice in medical imaging tasks is to start with a large image of a bodily region of interest and identify diseases by identifying the variations in local textures in the images. For example, in retinal fundus images, small red 'dots' indicate the presence of microaneurysms and diabetic retinopathy, and in chest x-rays local white opaque patches are signs of consolidation and pneumonia. This is in contrast to natural image datasets like ImageNet, where there is often a clear global subject of the image. There is thus a myriad of unaddressed open questions, such as how helpful ImageNet feature reuse is for medical images, amongst many others.
Google proposes hybrid approach to AI transfer learning for medical imaging
Medical imaging is among the most popular applications of AI and machine learning, and with good reason. Computer vision algorithms are naturally adept at spotting anomalies experts sometimes miss, in the process reducing wait times and lightening clinical workloads. Perhaps that's why, although the percentage of health care organizations that have adopted AI remains relatively low (22%) globally, the majority of practitioners (77%) believe the technology is important to the medical imaging field as a whole. Unsurprisingly, data scientists have devoted outsize time and attention to developing AI imaging models for use in health care systems, a few of which Google scientists detail in a paper accepted to this week's NeurIPS conference in Vancouver. In "Transfusion: Understanding Transfer Learning for Medical Imaging," coauthors hailing from Google Research (the R&D-focused arm of Google's business) investigate the role transfer learning plays in developing image classification algorithms.
Transfer Learning-Based Outdoor Position Recovery with Telco Data
Zhang, Yige, Ding, Aaron Yi, Ott, Jorg, Yuan, Mingxuan, Zeng, Jia, Zhang, Kun, Rao, Weixiong
Telecommunication (Telco) outdoor position recovery aims to localize outdoor mobile devices by leveraging measurement report (MR) data. Unfortunately, Telco position recovery requires a sufficient amount of MR samples across different areas and suffers from high data collection cost. For an area with scarce MR samples, it is hard to achieve good accuracy. In this paper, by leveraging recently developed transfer learning techniques, we design a novel Telco position recovery framework, called TLoc, to transfer good models in carefully selected source domains (fine-grained small subareas) to a target domain which originally suffers from poor localization accuracy. Specifically, TLoc introduces three dedicated components: 1) a new coordinate space to divide an area of interest into smaller domains, 2) a similarity measurement to select the best source domains, and 3) an adaptation of an existing transfer learning approach. To the best of our knowledge, TLoc is the first framework that demonstrates the efficacy of applying transfer learning to Telco outdoor position recovery. For example, on the 2G GSM and 4G LTE MR datasets in Shanghai, TLoc outperforms a nontransfer approach with 27.58% and 26.12% lower median errors, and further achieves 47.77% and 49.22% lower median errors than a recent fingerprinting approach, NBL.
Unsupervised Transfer Learning via BERT Neuron Selection
Valipour, Mehrdad, Lee, En-Shiun Annie, Jamacaro, Jaime R., Bessega, Carolina
Recent advancements in language representation models such as BERT have led to a rapid improvement in numerous natural language processing tasks. However, language models usually consist of a few hundred million trainable parameters with embedding space distributed across multiple layers, thus making them challenging to fine-tune for a specific task or to transfer to a new domain. To determine whether there are task-specific neurons that can be exploited for unsupervised transfer learning, we introduce a method for selecting the most important neurons to solve a specific classification task. This algorithm is further extended to multi-source transfer learning by computing the importance of neurons for several single-source transfer learning scenarios between different subsets of data sources. In addition, a task-specific fingerprint for each data source is obtained based on the percentage of the selected neurons in each layer. We perform extensive experiments in unsupervised transfer learning for sentiment analysis, natural language inference and sentence similarity, and compare our results with the existing literature and baselines. Significantly, we found that the source and target data sources with higher degrees of similarity between their task-specific fingerprints demonstrate a better transferability property. We conclude that our method can lead to better performance using just a few hundred task-specific and interpretable neurons.
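The per-layer fingerprint described in this abstract can be sketched under stated assumptions. Here the fingerprint is read as the share of selected neurons that fall in each layer, and cosine similarity is used to compare two fingerprints; both choices, along with the layer indices below, are illustrative and not taken from the paper.

```python
import numpy as np

def layer_fingerprint(selected_layers, n_layers):
    """Share of the selected neurons that fall in each layer (sums to 1)."""
    counts = np.bincount(np.asarray(selected_layers), minlength=n_layers)
    return counts / counts.sum()

def cosine_similarity(u, v):
    """Cosine similarity between two fingerprint vectors (assumed measure)."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical layer index of each selected neuron for two data sources.
source_layers = [0, 0, 1, 3]
target_layers = [0, 1, 1, 3]

fp_source = layer_fingerprint(source_layers, n_layers=4)
fp_target = layer_fingerprint(target_layers, n_layers=4)
similarity = cosine_similarity(fp_source, fp_target)
```

Under this reading, two data sources whose selected neurons concentrate in the same layers obtain a similarity close to 1.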
salesforce/decaNLP
The Natural Language Decathlon is a multitask challenge that spans ten tasks: question answering (SQuAD), machine translation (IWSLT), summarization (CNN/DM), natural language inference (MNLI), sentiment analysis (SST), semantic role labeling (QA‑SRL), zero-shot relation extraction (QA‑ZRE), goal-oriented dialogue (WOZ), semantic parsing (WikiSQL), and commonsense reasoning (MWSC). Each task is cast as question answering, which makes it possible to use our new Multitask Question Answering Network (MQAN). This model jointly learns all tasks in decaNLP without any task-specific modules or parameters in the multitask setting. For a more thorough introduction to decaNLP and the tasks, see the main website, our blog post, or the paper. While the research direction associated with this repository focused on multitask learning, the framework itself is designed in a way that should make single-task training, transfer learning, and zero-shot evaluation simple.
Transferability versus Discriminability: Joint Probability Distribution Adaptation (JPDA)
Transfer learning makes use of data or knowledge in one task to help solve a different, yet related, task. Many existing TL approaches are based on a joint probability distribution metric, which is a weighted sum of the marginal distribution and the conditional distribution; however, they optimize the two distributions independently, and ignore their intrinsic dependency. This paper proposes a novel and frustratingly easy Joint Probability Distribution Adaptation (JPDA) approach, to replace the frequently-used joint maximum mean discrepancy metric in transfer learning. During the distribution adaptation, JPDA improves the transferability between the source and the target domains by minimizing the joint probability discrepancy of the corresponding class, and also increases the discriminability between different classes by maximizing their joint probability discrepancy. Experiments on six image classification datasets demonstrated that JPDA outperforms several state-of-the-art metric-based transfer learning approaches.
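For context, the weighted-sum metric that this abstract says existing approaches rely on can be sketched as below. This is not JPDA itself: it assumes a linear kernel, under which the squared maximum mean discrepancy reduces to the squared distance between domain means, and it assumes target pseudo-labels and a trade-off weight `mu`, both of which are illustrative.

```python
import numpy as np

def mmd_linear(Xs, Xt):
    """Squared MMD with a linear kernel: squared distance between domain means."""
    delta = Xs.mean(axis=0) - Xt.mean(axis=0)
    return float(delta @ delta)

def weighted_joint_mmd(Xs, ys, Xt, yt_pseudo, mu=0.5):
    """Weighted sum of a marginal term and per-class conditional terms."""
    marginal = mmd_linear(Xs, Xt)
    conditional = 0.0
    for c in np.unique(ys):
        s_c, t_c = Xs[ys == c], Xt[yt_pseudo == c]
        if len(s_c) and len(t_c):          # skip classes missing in either domain
            conditional += mmd_linear(s_c, t_c)
    return (1 - mu) * marginal + mu * conditional

# Toy domains: the target is the source shifted by 1 in every dimension.
Xs = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
ys = np.array([0, 0, 1, 1])
Xt = Xs + 1.0
yt_pseudo = ys.copy()                      # assumed pseudo-labels for the target

discrepancy = weighted_joint_mmd(Xs, ys, Xt, yt_pseudo, mu=0.5)
```

The marginal and conditional terms are computed separately and only combined at the end, which is the independence between the two distributions that the abstract criticizes.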