 Transfer Learning


Data Efficient Lithography Modeling with Transfer Learning and Active Data Selection

arXiv.org Machine Learning

Lithography simulation is one of the key steps in physical verification, enabled by the substantial optical and resist models. A resist model bridges the aerial image simulation to printed patterns. While the effectiveness of learning-based solutions for resist modeling has been demonstrated, they are considerably data-demanding. Meanwhile, a set of manufactured data for a specific lithography configuration is only valid for the training of one single model, indicating low data efficiency. Due to the complexity of the manufacturing process, obtaining enough data for acceptable accuracy becomes very expensive in terms of both time and cost, especially during the evolution of technology generations when the design space is intensively explored. In this work, we propose a new resist modeling framework for contact layers, utilizing existing data from old technology nodes and active selection of data in a target technology node, to reduce the amount of data required from the target lithography configuration. Our framework, based on transfer learning and active learning techniques, achieves a 3-10X reduction in the amount of training data while keeping accuracy comparable to the state-of-the-art learning approach.


Understanding Social Networks Using Transfer Learning

IEEE Computer

Akin to human transfer of experiences, transfer learning as a subfield of machine learning adapts knowledge acquired in one domain to a new domain. The authors systematically investigate how this concept might be applied to the study of users on emerging Web platforms, proposing a transfer learning–based approach, TraNet.


Modular meta-learning

arXiv.org Machine Learning

In many situations, such as robot learning, training experience is very expensive. One strategy for reducing the amount of training data needed for a new task is to learn some form of prior or bias using data from several related tasks. The objective of this process is to extract information that will substantially reduce the training-data requirements for a new task. This problem is a form of transfer learning, sometimes also called meta-learning or "learning to learn" [1, 2]. Previous approaches to meta-learning for robotics have focused on finding distributions over [3] or initial values of [4, 5] parameters, based on a set of "training tasks," that will enable a new "test task" to be learned with many fewer training examples. Our objective is similar, but rather than focusing on transferring information about parameter values, we focus on finding a reusable set of modules that can form components of a solution to a new task, possibly with a small amount of tuning. Modular approaches to learning have been very successful in structured tasks such as natural-language sentence interpretation [6], in which the input signal gives relatively direct information about a good structural decomposition of the problem. We wish to address problems that may benefit from a modular decomposition but do not provide any task-level input from which the structure of a solution may be derived. Nonetheless, we adopt a similar modular structure and parameter-adaptation method for learning our reusable modules, but use a general-purpose simulated-annealing search strategy to find an appropriate structural decomposition for each new task.
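
The structural search described in this abstract can be illustrated with a toy sketch. It is not the authors' implementation: in the paper the modules are learned, whereas here `MODULES` is a hypothetical set of fixed functions, and the chain-of-two structure, the linear cooling schedule, and all names are assumptions chosen to keep the example small.

```python
import math
import random

# Hypothetical, fixed "modules"; in the paper these would be learned components.
MODULES = {
    "double": lambda v: 2 * v,
    "inc": lambda v: v + 1,
    "square": lambda v: v * v,
    "neg": lambda v: -v,
}


def loss(structure, data):
    """Mean squared error of the module chain `structure` on (x, y) pairs."""
    total = 0.0
    for x, y in data:
        out = x
        for name in structure:
            out = MODULES[name](out)
        total += (out - y) ** 2
    return total / len(data)


def anneal(data, length=2, steps=500, t0=5.0, seed=0):
    """Simulated annealing over which module fills each slot of the chain."""
    rng = random.Random(seed)
    names = list(MODULES)
    current = [rng.choice(names) for _ in range(length)]
    current_loss = loss(current, data)
    best, best_loss = list(current), current_loss
    for step in range(steps):
        temp = t0 * (1 - step / steps) + 1e-9  # linear cooling (assumption)
        proposal = list(current)
        proposal[rng.randrange(length)] = rng.choice(names)
        prop_loss = loss(proposal, data)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        accept = prop_loss <= current_loss or (
            rng.random() < math.exp((current_loss - prop_loss) / temp)
        )
        if accept:
            current, current_loss = proposal, prop_loss
            if current_loss < best_loss:
                best, best_loss = list(current), current_loss
    return best, best_loss


# A "new task" with few examples: y = (x + 1) ** 2.
task_data = [(0, 1), (1, 4), (2, 9)]
best_structure, best_loss = anneal(task_data)
```

Only the structure is searched here; a fuller version would also adapt module parameters for each candidate structure before scoring it.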


Domain Adaptation for Infection Prediction from Symptoms Based on Data from Different Study Designs and Contexts

arXiv.org Machine Learning

Acute respiratory infections have epidemic and pandemic potential and thus are being studied worldwide, albeit in many different contexts and study formats. Predicting infection from symptom data is critical, though using symptom data from varied studies in aggregate is challenging because the data is collected in different ways. Accordingly, different symptom profiles could be more predictive in certain studies, or even symptoms of the same name could have different meanings in different contexts. We assess state-of-the-art transfer learning methods for improving prediction of infection from symptom data in multiple types of health care data, ranging from clinical to home-visit and crowdsourced studies. We show interesting characteristics regarding six different study types and their feature domains. Further, we demonstrate that it is possible to use data collected from one study to predict infection in another, with performance close to or better than that of a model trained and tested on a single dataset. We also investigate under which conditions specific transfer learning and domain adaptation methods may perform better on symptom data. This work has the potential for broad applicability, as we show how it is possible to transfer learning from one public health study design to another, and data collected from one study may be used for prediction of labels for another, even when collected through different study designs, populations and contexts.


Curriculum Learning by Transfer Learning: Theory and Experiments with Deep Networks

arXiv.org Artificial Intelligence

We provide theoretical investigation of curriculum learning in the context of stochastic gradient descent when optimizing the convex linear regression loss. We prove that the rate of convergence of an ideal curriculum learning method is monotonically increasing with the difficulty of the examples. Moreover, among all equally difficult points, convergence is faster when using points which incur higher loss with respect to the current hypothesis. We then analyze curriculum learning in the context of training a CNN. We describe a method which infers the curriculum by way of transfer learning from another network, pre-trained on a different task. While this approach can only approximate the ideal curriculum, we observe empirically similar behavior to the one predicted by the theory, namely, a significant boost in convergence speed at the beginning of training. When the task is made more difficult, improvement in generalization performance is also observed. Finally, curriculum learning exhibits robustness against unfavorable conditions such as excessive regularization.
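
The idea of inferring a curriculum from another network can be sketched as follows. This is a minimal illustration, not the authors' implementation: `difficulty` stands in for a score derived from a pre-trained network (for example, one minus its confidence on the true label), and the linear pacing schedule and all function names are assumptions.

```python
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def curriculum_order(examples: Sequence[T], difficulty: Callable[[T], float]) -> List[T]:
    """Sort examples from easiest to hardest using an external difficulty score."""
    return sorted(examples, key=difficulty)


def pacing(step: int, total_steps: int, n: int, start_frac: float = 0.2) -> int:
    """Hypothetical linear pacing: how many of the easiest examples are available at `step`."""
    progress = min(step / max(total_steps - 1, 1), 1.0)
    frac = start_frac + (1.0 - start_frac) * progress
    return max(1, round(frac * n))


def curriculum_pools(
    examples: Sequence[T], difficulty: Callable[[T], float], total_steps: int
) -> Iterator[List[T]]:
    """Yield, for each training step, the pool that mini-batches may be sampled from."""
    ordered = curriculum_order(examples, difficulty)
    for step in range(total_steps):
        yield ordered[: pacing(step, total_steps, len(ordered))]


# Toy usage: the integers double as their own difficulty scores.
pools = list(curriculum_pools([5, 1, 4, 2, 3], float, total_steps=3))
```

Training would then draw each mini-batch uniformly from the current pool, so early updates see only the examples the scoring network finds easy.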


Adapted Deep Embeddings: A Synthesis of Methods for $k$-Shot Inductive Transfer Learning

arXiv.org Machine Learning

The focus in machine learning has branched beyond training classifiers on a single task to investigating how previously acquired knowledge in a source domain can be leveraged to facilitate learning in a related target domain, known as inductive transfer learning. Three active lines of research have independently explored transfer learning using neural networks. In weight transfer, a model trained on the source domain is used as an initialization point for a network to be trained on the target domain. In deep metric learning, the source domain is used to construct an embedding that captures class structure in both the source and target domains. In few-shot learning, the focus is on generalizing well in the target domain based on a limited number of labeled examples. We compare state-of-the-art methods from these three paradigms and also explore hybrid adapted-embedding methods that use limited target-domain data to fine-tune embeddings constructed from source-domain data. We conduct a systematic comparison of methods in a variety of domains, varying the number of labeled instances available in the target domain ($k$), as well as the number of target-domain classes. We reach three principal conclusions: (1) Deep embeddings are far superior to weight transfer as a starting point for inter-domain transfer or model re-use. (2) Our hybrid methods robustly outperform every few-shot learning and every deep metric learning method previously proposed, with a mean error reduction of 30% over the state of the art. (3) Among loss functions for discovering embeddings, the histogram loss (Ustinova & Lempitsky, 2016) is most robust. We hope our results will motivate a unification of research in weight transfer, deep metric learning, and few-shot learning.
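
To make the $k$-shot setting concrete, the sketch below shows one simple way a source-domain embedding might be reused with a handful of labeled target examples: a nearest-centroid classifier in embedding space. It is an illustration only and not one of the specific methods compared in the paper; `embed` is a hypothetical stand-in for a network trained on the source domain, and the toy data are made up.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, List, Sequence, Tuple

Vector = Sequence[float]


def class_centroids(
    support: List[Tuple[Vector, Hashable]], embed: Callable[[Vector], Vector]
) -> Dict[Hashable, List[float]]:
    """Average the embeddings of the k labeled target examples of each class."""
    grouped = defaultdict(list)
    for x, label in support:
        grouped[label].append(embed(x))
    return {
        label: [sum(dim) / len(vecs) for dim in zip(*vecs)]
        for label, vecs in grouped.items()
    }


def classify(x: Vector, centroids: Dict[Hashable, List[float]],
             embed: Callable[[Vector], Vector]) -> Hashable:
    """Assign x to the class whose centroid is nearest in embedding space."""
    z = embed(x)

    def sq_dist(center: Sequence[float]) -> float:
        return sum((zi - ci) ** 2 for zi, ci in zip(z, center))

    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def embed(x: Vector) -> Vector:
    """Placeholder embedding (identity); a real one would come from the source domain."""
    return x


# k = 2 labeled target examples per class (made-up data).
support = [((0.0, 0.1), "a"), ((0.2, 0.0), "a"), ((1.0, 0.9), "b"), ((0.8, 1.0), "b")]
centroids = class_centroids(support, embed)
label = classify((0.9, 0.8), centroids, embed)
```

An adapted-embedding variant would additionally update the embedding itself on the $k$ target examples rather than keeping it frozen.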


Investigating the Impact of Data Volume and Domain Similarity on Transfer Learning Applications

arXiv.org Artificial Intelligence

Transfer learning allows practitioners to recognize and apply knowledge learned in previous tasks (source tasks) to new tasks or new domains (target tasks) that share some commonality. The two important factors impacting the performance of transfer learning models are: (a) the size of the target dataset, and (b) the similarity in distribution between source and target domains. Thus far, there has been little investigation into just how important these factors are. In this paper, we investigate the impact of target dataset size and source/target domain similarity on model performance through a series of experiments. We find that more data is always beneficial, and model performance improves linearly with the log of data size, until we are out of data. As source/target domains differ, more data is required and fine-tuning will render better performance than feature extraction. When source/target domains are similar and data size is small, fine-tuning and feature extraction render equivalent performance. Our hope is that by beginning this quantitative investigation on the effect of data volume and domain similarity in transfer learning we might inspire others to explore the significance of data in developing more accurate statistical models.
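
The log-linear relationship reported in this abstract, performance ≈ a + b · ln(data size), can be checked on one's own measurements with an ordinary least-squares fit. The sketch below is an illustration under that assumed functional form; the numbers are made up and are not results from the paper.

```python
import math
from typing import Sequence, Tuple


def fit_log_linear(sizes: Sequence[float], scores: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of score = a + b * ln(size); returns (a, b)."""
    xs = [math.log(n) for n in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(scores) / len(scores)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, scores))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Made-up measurements for illustration only.
sizes = [100, 1_000, 10_000]
scores = [0.60, 0.70, 0.80]
a, b = fit_log_linear(sizes, scores)

# Extrapolate one more decade of data under the fitted trend.
predicted = a + b * math.log(100_000)
```

Under such a fit, every tenfold increase in target data adds the same increment `b * ln(10)` to the score, which is what "linear in the log of data size" means in practice; the trend cannot continue indefinitely for a bounded metric such as accuracy.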


SOSELETO: A Unified Approach to Transfer Learning and Training with Noisy Labels

arXiv.org Artificial Intelligence

We present SOSELETO (SOurce SELEction for Target Optimization), a new method for exploiting a source dataset to solve a classification problem on a target dataset. SOSELETO is based on the following simple intuition: some source examples are more informative than others for the target problem. To capture this intuition, source samples are each given weights; these weights are solved for jointly with the source and target classification problems via a bilevel optimization scheme. The target therefore gets to choose the source samples which are most informative for its own classification task. Furthermore, the bilevel nature of the optimization acts as a kind of regularization on the target, mitigating overfitting. SOSELETO may be applied both to classic transfer learning and to the problem of training on datasets with noisy labels; we show state-of-the-art results on both of these problems.
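
The intuition of letting the target choose its source samples can be illustrated with a toy bilevel sketch. This is not the SOSELETO algorithm as published: it uses a one-parameter linear regression model instead of a classifier, a single look-ahead gradient step for the inner problem, and hypothetical names and learning rates throughout. Each source weight is moved along the negative gradient of the target loss evaluated after the weighted source step.

```python
def reweight_sources(source, target, steps=200, lr=0.1, lr_alpha=1.0):
    """Jointly learn theta (model y = theta * x) and per-sample source weights alpha."""
    theta = 0.0
    alpha = [1.0] * len(source)
    n = len(source)
    for _ in range(steps):
        # Per-sample gradients of the loss 0.5 * (theta * x - y) ** 2 on the source data.
        grads = [(theta * x - y) * x for x, y in source]
        # Inner step: weighted gradient descent on the source loss.
        theta_next = theta - lr * sum(a * g for a, g in zip(alpha, grads)) / n
        # Outer signal: target-loss gradient at the looked-ahead parameter.
        g_target = sum((theta_next * x - y) * x for x, y in target) / len(target)
        # Chain rule: dL_target/d alpha_j = g_target * (-lr / n) * grads[j].
        # Descend on it and clip the weights to [0, 1].
        alpha = [
            min(1.0, max(0.0, a + lr_alpha * (lr / n) * g_target * g))
            for a, g in zip(alpha, grads)
        ]
        theta = theta_next
    return theta, alpha


# Toy data: the first two source samples agree with the target (y = 2x),
# the last two have corrupted labels (y = -2x).
source = [(1.0, 2.0), (2.0, 4.0), (1.0, -2.0), (2.0, -4.0)]
target = [(1.0, 2.0), (1.5, 3.0)]
theta, alpha = reweight_sources(source, target)
```

In this toy setting, source samples whose gradients point the same way as the target gradient keep their weight, while samples that pull the model away from the target are down-weighted.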


Active Semi-supervised Transfer Learning (ASTL) for Offline BCI Calibration

arXiv.org Machine Learning

Single-trial classification of event-related potentials in electroencephalogram (EEG) signals is a very important paradigm of brain-computer interface (BCI). Because of individual differences, usually some subject-specific calibration data are required to tailor the classifier for each subject. Transfer learning has been extensively used to reduce such calibration data requirement, by making use of auxiliary data from similar/relevant subjects/tasks. However, all previous research assumes that all auxiliary data have been labeled. This paper considers a more general scenario, in which part of the auxiliary data could be unlabeled. We propose active semi-supervised transfer learning (ASTL) for offline BCI calibration, which integrates active learning, semi-supervised learning, and transfer learning. Using a visual evoked potential oddball task and three different EEG headsets, we demonstrate that ASTL can achieve consistently good performance across subjects and headsets, and it outperforms some state-of-the-art approaches in the literature.


Dropping Networks for Transfer Learning

arXiv.org Machine Learning

In natural language understanding, many challenges require learning relationships between two sequences for tasks such as similarity, relatedness, paraphrasing and question matching. Some of these challenges are inherently closer in nature, so knowledge acquired on one task is more easily transferred and adapted to another. However, transferring all knowledge may be undesirable and can lead to sub-optimal results due to negative transfer. Hence, this paper focuses on the transferability of both instances and parameters across natural language understanding tasks, using an ensemble-based transfer learning method to circumvent such issues. The primary contribution of this paper is the combination of Dropout and Bagging for improved transferability in neural networks, referred to as Dropping herein. Secondly, we present a straightforward yet novel approach to incorporating source Dropping Networks into a target task for few-shot learning that mitigates negative transfer. This is achieved by using a decaying parameter chosen according to the slope changes of a smoothed spline error curve at sub-intervals during training. We evaluate the approach against hard parameter sharing, soft parameter sharing and single-task learning. The aforementioned adjustment leads to improved transfer learning performance and results comparable to the current state of the art while using only a few instances from the target task.