Goto

Collaborating Authors

 tl method


Evaluating Transfer Learning Methods on Real-World Data Streams: A Case Study in Financial Fraud Detection

arXiv.org Artificial Intelligence

When the available data for a target domain is limited, transfer learning (TL) methods can be used to develop models on related data-rich domains, before deploying them on the target domain. However, these TL methods are typically designed with specific, static assumptions on the amount of available labeled and unlabeled target data. This is in contrast with many real world applications, where the availability of data and corresponding labels varies over time. Since the evaluation of the TL methods is typically also performed under the same static data availability assumptions, this would lead to unrealistic expectations concerning their performance in real world settings. To support a more realistic evaluation and comparison of TL algorithms and models, we propose a data manipulation framework that (1) simulates varying data availability scenarios over time, (2) creates multiple domains through resampling of a given dataset and (3) introduces inter-domain variability by applying realistic domain transformations, e.g., creating a variety of potentially time-dependent covariate and concept shifts. These capabilities enable simulation of a large number of realistic variants of the experiments, in turn providing more information about the potential behavior of algorithms when deployed in dynamic settings. We demonstrate the usefulness of the proposed framework by performing a case study on a proprietary real-world suite of card payment datasets. Given the confidential nature of the case study, we also illustrate the use of the framework on the publicly available Bank Account Fraud (BAF) dataset. By providing a methodology for evaluating TL methods over time and in realistic data availability scenarios, our framework facilitates understanding of the behavior of models and algorithms. This leads to better decision making when deploying models for new domains in real-world environments.


Bridging Data Gaps in Healthcare: A Scoping Review of Transfer Learning in Biomedical Data Analysis

arXiv.org Artificial Intelligence

Clinical and biomedical research in low-resource settings often faces significant challenges due to the need for high-quality data with sufficient sample sizes to construct effective models. These constraints hinder robust model training and prompt researchers to seek methods for leveraging existing knowledge from related studies to support new research efforts. Transfer learning (TL), a machine learning technique, emerges as a powerful solution by utilizing knowledge from pre-trained models to enhance the performance of new models, offering promise across various healthcare domains. Despite its conceptual origins in the 1990s, the application of TL in medical research has remained limited, especially beyond image analysis. In our review of TL applications in structured clinical and biomedical data, we screened 3,515 papers, with 55 meeting the inclusion criteria. Among these, only 2% (one out of 55) utilized external studies, and 7% (four out of 55) addressed scenarios involving multi-site collaborations with privacy constraints. To achieve actionable TL with structured medical data while addressing regional disparities, inequality, and privacy constraints in healthcare research, we advocate for the careful identification of appropriate source data and models, the selection of suitable TL frameworks, and the validation of TL models with proper baselines.


Selecting Subsets of Source Data for Transfer Learning with Applications in Metal Additive Manufacturing

arXiv.org Artificial Intelligence

ABSTRACT Considering data insufficiency in metal additive manufacturing (AM), transfer learning (TL) has been adopted to extract knowledge from source domains (e.g., completed printings) to improve the modeling performance in target domains (e.g., new printings). Current applications use all accessible source data directly in TL with no regard to the similarity between source and target data. This paper proposes a systematic method to find appropriate subsets of source data based on similarities between the source and target datasets for a given set of limited target domain data. Such similarity is characterized by the spatial and model distance metrics. A Pareto frontier-based source data selection method is developed, where the source data located on the Pareto frontier defined by two similarity distance metrics are selected iteratively. The method is integrated into an instance-based TL method (decision tree regression model) and a model-based TL method (fine-tuned artificial neural network). Both models are then tested on several regression tasks in metal AM. Comparison results demonstrate that 1) the source data selection method is general and supports integration with various TL methods and distance metrics, 2) compared with using all source data, the proposed method can find a small subset of source data from the same domain with better TL performance in metal AM regression tasks involving different processes and machines, and 3) when multiple source domains exist, the source data selection method could find the subset from one source domain to obtain comparable or better TL performance than the model constructed using data from all source domains. Keywords: metal additive manufacturing, transfer learning, source data selection, Pareto frontier 1 Introduction Metal additive manufacturing (AM) fabricates parts by depositing metal materials layer by layer with various heat sources, e.g., the laser beam and electric arc. Although metal AM has been adopted in electronics (Pang et al. 2020), automotive (Vasco 2021), aerospace (Blakey-Milner et al. 2021), and other industries, low productivity and unstable quality are two drawbacks that restrict the applications of metal AM. To alleviate the two drawbacks, constructing data-driven models to reveal correlations among processes, structures, and properties has attracted attention in both industry and academia. These models are built based on collected data from experiments or simulations and adopted for process optimization, control, or monitoring to improve the quality of printed parts.


Rethinking Transfer Learning for Medical Image Classification

arXiv.org Artificial Intelligence

Transfer learning (TL) from pretrained deep models is a standard practice in modern medical image classification (MIC). However, what levels of features to be reused are problem-dependent, and uniformly finetuning all layers of pretrained models may be suboptimal. This insight has partly motivated the recent differential TL strategies, such as TransFusion (TF) and layer-wise finetuning (LWFT), which treat the layers in the pretrained models differentially. In this paper, we add one more strategy into this family, called TruncatedTL, which reuses and finetunes appropriate bottom layers and directly discards the remaining layers. This yields not only superior MIC performance but also compact models for efficient inference, compared to other differential TL methods. Our code is available at: https://github.com/sun-umn/TTL


Comparison of Transfer Learning based Additive Manufacturing Models via A Case Study

arXiv.org Artificial Intelligence

Transfer learning (TL) based additive manufacturing (AM) modeling is an emerging field to reuse the data from historical products and mitigate the data insufficiency in modeling new products. Although some trials have been conducted recently, the inherent challenges of applying TL in AM modeling are seldom discussed, e.g., which source domain to use, how much target data is needed, and whether to apply data preprocessing techniques. This paper aims to answer those questions through a case study defined based on an open-source dataset about metal AM products. In the case study, five TL methods are integrated with decision tree regression (DTR) and artificial neural network (ANN) to construct six TL-based models, whose performances are then compared with the baseline DTR and ANN in a proposed validation framework. The comparisons are used to quantify the performance of applied TL methods and are discussed from the perspective of similarity, training data size, and data preprocessing. Finally, the source AM domain with larger qualitative similarity and a certain range of target-to-source training data size ratio are recommended. Besides, the data preprocessing should be performed carefully to balance the modeling performance and the performance improvement due to TL.


Ensemble Transfer Learning for Multilingual Coreference Resolution

arXiv.org Artificial Intelligence

Entity coreference resolution is an important research problem with many applications, including information extraction and question answering. Coreference resolution for English has been studied extensively. However, there is relatively little work for other languages. A problem that frequently occurs when working with a non-English language is the scarcity of annotated training data. To overcome this challenge, we design a simple but effective ensemble-based framework that combines various transfer learning (TL) techniques. We first train several models using different TL methods. Then, during inference, we compute the unweighted average scores of the models' predictions to extract the final set of predicted clusters. Furthermore, we also propose a low-cost TL method that bootstraps coreference resolution models by utilizing Wikipedia anchor texts. Leveraging the idea that the coreferential links naturally exist between anchor texts pointing to the same article, our method builds a sizeable distantly-supervised dataset for the target language that consists of tens of thousands of documents. We can pre-train a model on the pseudo-labeled dataset before finetuning it on the final target dataset. Experimental results on two benchmark datasets, OntoNotes and SemEval, confirm the effectiveness of our methods. Our best ensembles consistently outperform the baseline approach of simple training by up to 7.68% in the F1 score. These ensembles also achieve new state-of-the-art results for three languages: Arabic, Dutch, and Spanish.



Easy Transfer Learning By Exploiting Intra-domain Structures

arXiv.org Machine Learning

Transfer learning aims at transferring knowledge from a well-labeled domain to a similar but different domain with limited or no labels. Unfortunately, existing learning-based methods often involve intensive model selection and hyperparameter tuning to obtain good results. Moreover, cross-validation is not possible for tuning hyperparameters since there are often no labels in the target domain. This would restrict wide applicability of transfer learning especially in computationally-constraint devices such as wearables. In this paper, we propose a practically Easy Transfer Learning (EasyTL) approach which requires no model selection and hyperparameter tuning, while achieving competitive performance. By exploiting intra-domain structures, EasyTL is able to learn both non-parametric transfer features and classifiers. Extensive experiments demonstrate that, compared to state-of-the-art traditional and deep methods, EasyTL satisfies the Occam's Razor principle: it is extremely easy to implement and use while achieving comparable or better performance in classification accuracy and much better computational efficiency. Additionally, it is shown that EasyTL can increase the performance of existing transfer feature learning methods.


Cross-Domain Kernel Induction for Transfer Learning

AAAI Conferences

The key question in transfer learning (TL) research is how to make model induction transferable across different domains. Common methods so far require source and target domains to have a shared/homogeneous feature space, or the projection of features from heterogeneous domains onto a shared space. This paper proposes a novel framework, which does not require a shared feature space but instead uses a parallel corpus to calibrate domain-specific kernels into a unified kernel, to leverage graph-based label propagation in cross-domain settings, and to optimize semi-supervised learning based on labeled and unlabeled data in both source and target domains. Our experiments on benchmark datasets show advantageous performance of the proposed method over that of other state-of-the-art TL methods.