Transfer Learning


Transfer Learning on Heterogeneous Feature Spaces for Treatment Effects Estimation

arXiv.org Artificial Intelligence

Consider the problem of improving the estimation of conditional average treatment effects (CATE) for a target domain of interest by leveraging related information from a source domain with a different feature space. This heterogeneous transfer learning problem for CATE estimation is ubiquitous in areas such as healthcare, where we may wish to evaluate the effectiveness of a treatment for a new patient population for which different clinical covariates and limited data are available. In this paper, we address this problem by introducing several building blocks that use representation learning to handle the heterogeneous feature spaces and a flexible multi-task architecture with shared and private layers to transfer information between potential outcome functions across domains. Then, we show how these building blocks can be used to recover transfer learning equivalents of the standard CATE learners. On a new semi-synthetic data simulation benchmark for heterogeneous transfer learning, we not only demonstrate performance improvements of our heterogeneous transfer causal effect learners across datasets, but also provide insights into the differences between these learners from a transfer perspective.
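For readers unfamiliar with the "standard CATE learners" mentioned above, the following is a minimal sketch of one of them, the T-learner, on a single domain with one covariate. It is not the paper's transfer architecture: the linear outcome models, the helper names (`fit_line`, `t_learner_cate`), and the noise-free toy data are all assumptions made for illustration.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one covariate)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def t_learner_cate(xs, treatments, ys):
    """T-learner: fit one outcome model per treatment arm, then subtract.

    Returns a function x -> estimated CATE = mu_1(x) - mu_0(x).
    """
    control = [(x, y) for x, t, y in zip(xs, treatments, ys) if t == 0]
    treated = [(x, y) for x, t, y in zip(xs, treatments, ys) if t == 1]
    a0, b0 = fit_line(*zip(*control))  # model of the untreated outcome
    a1, b1 = fit_line(*zip(*treated))  # model of the treated outcome
    return lambda x: (a1 + b1 * x) - (a0 + b0 * x)


# Hypothetical toy data: untreated outcome is 1 + 2x, treated outcome is 2 + 3x,
# so the true CATE at covariate value x is 1 + x.
xs = [0.0, 1.0, 2.0, 3.0, 0.5, 1.5, 2.5, 3.5]
treatments = [0, 0, 0, 0, 1, 1, 1, 1]
ys = [2 + 3 * x if t == 1 else 1 + 2 * x for x, t in zip(xs, treatments)]

cate = t_learner_cate(xs, treatments, ys)
print(cate(2.0))  # close to the true effect 1 + 2.0
```

In the setting the abstract describes, the two outcome models would additionally share information across a source and a target domain with different covariates, which this sketch does not attempt.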


The Power of Transfer Learning in Agricultural Applications: AgriNet

arXiv.org Artificial Intelligence

Advances in deep learning and transfer learning have paved the way for various automated classification tasks in agriculture, including the detection of plant diseases, pests, weeds, and plant species. However, agricultural automation still faces various challenges, such as the limited size of datasets and the absence of plant-domain-specific pretrained models. Domain-specific pretrained models have shown state-of-the-art performance in various computer vision tasks, including face recognition and medical imaging diagnosis. In this paper, we propose the AgriNet dataset, a collection of 160k agricultural images from more than 19 geographical locations, several image-capturing devices, and more than 423 classes of plant species and diseases. We also introduce the AgriNet models, a set of models pretrained on five ImageNet architectures: VGG16, VGG19, Inception-v3, InceptionResNet-v2, and Xception. AgriNet-VGG19 achieved the highest classification accuracy of 94% and the highest F1-score of 92%. Additionally, all proposed models were found to accurately classify the 423 classes of plant species, diseases, pests, and weeds, with a minimum accuracy of 87% for the Inception-v3 model. Finally, experiments to evaluate the superiority of AgriNet models over ImageNet models were conducted on two external datasets: a pest and plant diseases dataset from Bangladesh and a plant diseases dataset from Kashmir.


Transfer Learning Framework for Low-Resource Text-to-Speech using a Large-Scale Unlabeled Speech Corpus

arXiv.org Artificial Intelligence

Training a text-to-speech (TTS) model requires a large-scale text-labeled speech corpus, which is troublesome to collect. In this paper, we propose a transfer learning framework for TTS that utilizes a large unlabeled speech dataset for pre-training. By leveraging wav2vec2.0 representations, unlabeled speech can substantially improve performance, especially when labeled speech is scarce. We also extend the proposed method to zero-shot multi-speaker TTS (ZS-TTS). The experimental results verify the effectiveness of the proposed method in terms of naturalness, intelligibility, and speaker generalization. We highlight that the single-speaker TTS model fine-tuned on only 10 minutes of labeled data outperforms the other baselines, and that the ZS-TTS model fine-tuned on only 30 minutes of single-speaker data can generate the voice of an arbitrary speaker, thanks to pre-training on an unlabeled multi-speaker speech corpus.


Transfer Learning for Time Series Forecasting

#artificialintelligence

In this article, we will see how transfer learning can be applied to time series forecasting, and how forecasting models can be trained once on a diverse time series dataset and later used to obtain forecasts on different datasets without training on them. We will use the open-source Darts library to do all this in a few lines of code. A self-contained notebook containing everything needed to reproduce the results is available here. Time series forecasting has numerous applications in supply chain, energy, agriculture, control, IT operations, finance and other domains. For a long time, the best-performing approaches were relatively sophisticated statistical methods such as Exponential Smoothing or ARIMA. Recently, however, machine learning and deep learning have started to outperform these classical approaches on a number of forecasting tasks and competitions.


Star-Graph Multimodal Matching Component Analysis for Data Fusion and Transfer Learning

arXiv.org Artificial Intelligence

The matching component analysis (MCA) technique for transfer learning [1] finds two maps (one from each of two data domains to a lower-dimensional, common domain) using only a small number of matched data pairs, where each matched data pair consists of one point from each data domain. These maps minimize the expected distance between mapped data pairs within the common domain, subject to an identity matrix covariance constraint and an affine linear structure. Learning techniques can then be applied to matched data points after they are mapped to the common domain, where each such point is encoded with information from both data domains via its respective optimal affine linear transformation. In [2], the covariance-generalized MCA (CGMCA) technique was developed in order to allow for the encoding of additional statistical information into the MCA maps. This was done by generalizing the identity matrix covariance constraint of MCA to accommodate any covariance matrix (compare Figures 1a and 1b). We are interested in extending the application space of CGMCA to accommodate three or more data domains simultaneously.
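The two-domain MCA problem described above can be written out roughly as follows. This is a sketch reconstructed from the wording of the abstract, not the formulation given in [1] or [2]: the symbols $X_1, X_2$ (matched random points from the two domains), $A_i, b_i$ (the affine maps), $k$ (the common dimension), and the use of a squared Euclidean norm for "distance" are all assumptions.

```latex
\begin{aligned}
\min_{A_1,\, b_1,\, A_2,\, b_2} \quad
  & \mathbb{E}\!\left[\, \bigl\| (A_1 X_1 + b_1) - (A_2 X_2 + b_2) \bigr\|_2^2 \,\right] \\
\text{subject to} \quad
  & \operatorname{Cov}(A_i X_i + b_i) = I_k, \qquad i = 1, 2 .
\end{aligned}
```

Under the same assumptions, the covariance-generalized variant would replace $I_k$ in the constraint with a chosen covariance matrix.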


Neural Style Transfer -- A practice in transfer learning

#artificialintelligence

The picture above was taken by me in London. An idea always pops up in my mind when I look at it: what if I turned this picture into an oil painting? It would be a masterpiece! Thanks to Gatys et al., whose article helped me dive into the beauty of deep learning; this whole article is based on their paper. Before we begin, let's talk about something interesting: what are deep convnets really learning?


Feature-based Transfer Learning vs Fine Tuning?

#artificialintelligence

You can follow me on LinkedIn! Note: There are different angles from which to answer an interview question. The author of this newsletter does not try to find a reference that answers a question exhaustively. Rather, the author would like to share some quick insights and help readers think, practice, and do further research as necessary.


Transfer Learning with Pre-trained Conditional Generative Models

arXiv.org Artificial Intelligence

Transfer learning is crucial in training deep neural networks on new target tasks. Current transfer learning methods always assume at least one of (i) source and target task label spaces overlap, (ii) source datasets are available, and (iii) target network architectures are consistent with source ones. However, these assumptions are difficult to satisfy in practical settings because the target task rarely has the same labels as the source task, access to the source dataset is restricted due to storage costs and privacy, and the target architecture is often specialized to each task. To transfer source knowledge without these assumptions, we propose a transfer learning method that uses deep generative models and is composed of the following two stages: pseudo pre-training (PP) and pseudo semi-supervised learning (P-SSL). PP trains a target architecture with an artificial dataset synthesized by using conditional source generative models. P-SSL applies SSL algorithms to labeled target data and unlabeled pseudo samples, which are generated by cascading the source classifier and generative models to condition them with target samples. Our experimental results indicate that our method can outperform the baselines of scratch training and knowledge distillation. Transfer learning, which leverages the knowledge of related (source) tasks for new (target) tasks via joint training or pre-training of source models, is essential for training deep neural networks on new tasks. There are many transfer learning methods for deep models under various conditions (Pan & Yang, 2010; Wang & Deng, 2018). For instance, domain adaptation transfers source knowledge to the target task by minimizing the domain gaps (Ganin et al., 2016), and fine-tuning uses the pre-trained weights on source tasks as the initial weights of the target models (Yosinski et al., 2014).
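The fine-tuning idea mentioned at the end of this abstract (pre-trained source weights used as the initial weights of the target model) can be illustrated with a deliberately tiny sketch. It is not the paper's PP or P-SSL method: the one-parameter model y = w * x, the toy source and target tasks, and the helper names are assumptions chosen so the example runs with the Python standard library alone.

```python
def mse(w, xs, ys):
    """Mean squared error of the one-parameter model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient_descent(w_init, xs, ys, steps, lr=0.01):
    """Plain gradient descent on the MSE, starting from w_init."""
    w = w_init
    for _ in range(steps):
        grad = 2 * sum(x * (w * x - y) for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
    return w


xs = [1.0, 2.0, 3.0]
source_ys = [2.0 * x for x in xs]  # hypothetical source task: y = 2.0 * x
target_ys = [2.2 * x for x in xs]  # hypothetical related target task: y = 2.2 * x

# Pre-train on the source task, starting from zero.
w_source = gradient_descent(0.0, xs, source_ys, steps=200)

# Same small training budget on the target task, two different starting points.
w_finetuned = gradient_descent(w_source, xs, target_ys, steps=5)  # warm start
w_scratch = gradient_descent(0.0, xs, target_ys, steps=5)         # cold start

print(mse(w_finetuned, xs, target_ys), mse(w_scratch, xs, target_ys))
```

With the same five update steps on the target data, the warm-started model ends with a lower target error than the one trained from scratch, because the source and target tasks in this toy setup are close to each other.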


An Efficient Multitask Learning Architecture for Affective Vocal Burst Analysis

arXiv.org Artificial Intelligence

Affective speech analysis is an ongoing topic of research. A relatively new problem in this field is the analysis of vocal bursts, which are nonverbal vocalisations such as laughs or sighs. Current state-of-the-art approaches to address affective vocal burst analysis are mostly based on wav2vec2 or HuBERT features. In this paper, we investigate the use of the wav2vec successor data2vec in combination with a multitask learning pipeline to tackle different analysis problems at once. To assess the performance of our efficient multitask learning architecture, we participate in the 2022 ACII Affective Vocal Burst Challenge, showing that our approach substantially outperforms the baseline established there in three different subtasks.


SHiFT: An Efficient, Flexible Search Engine for Transfer Learning

arXiv.org Artificial Intelligence

Transfer learning can be seen as a data- and compute-efficient alternative to training models from scratch. The emergence of rich model repositories, such as TensorFlow Hub, enables practitioners and researchers to unleash the potential of these models across a wide range of downstream tasks. As these repositories keep growing exponentially, efficiently selecting a good model for the task at hand becomes paramount. By carefully comparing various selection and search strategies, we realize that no single method outperforms the others, and hybrid or mixed strategies can be beneficial. Therefore, we propose SHiFT, the first downstream task-aware, flexible, and efficient model search engine for transfer learning. These properties are enabled by a custom query language SHiFT-QL together with a cost-based decision maker, which we empirically validate. Motivated by the iterative nature of machine learning development, we further support efficient incremental executions of our queries, which requires a careful implementation when jointly used with our optimizations.