AITopics | Transfer Learning

Collaborating Authors

Transfer Learning

Transfer Learning is the reuse of a pre-trained model on a new problem. (Towards Data Science)

News Overviews Instructional Materials AI-Alerts Classics

Risk of Transfer Learning and its Applications in Finance

Cao, Haoyang, Gu, Haotian, Guo, Xin, Rosenbaum, Mathieu

arXiv.org Artificial IntelligenceNov-6-2023

Transfer learning is an emerging and popular paradigm for utilizing existing knowledge from previous learning tasks to improve the performance of new ones. In this paper, we propose a novel concept of transfer risk and and analyze its properties to evaluate transferability of transfer learning. We apply transfer learning techniques and this concept of transfer risk to stock return prediction and portfolio optimization problems. Numerical results demonstrate a strong correlation between transfer risk and overall transfer learning performance, where transfer risk provides a computationally efficient way to identify appropriate source tasks in transfer learning, including cross-continent, cross-sector, and cross-frequency transfer for portfolio optimization.

artificial intelligence, machine learning, transfer risk, (18 more...)

arXiv.org Artificial Intelligence

2311.03283

Country:

North America > United States (0.46)
Europe (0.28)
Asia (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (1.00)
Banking & Finance > Trading (1.00)
Energy > Oil & Gas > Upstream (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Quantifying the value of information transfer in population-based SHM

Hughes, Aidan J., Poole, Jack, Dervilis, Nikolaos, Gardner, Paul, Worden, Keith

arXiv.org Artificial IntelligenceNov-6-2023

Population-based structural health monitoring (PBSHM), seeks to address some of the limitations associated with data scarcity that arise in traditional SHM. A tenet of the population-based approach to SHM is that information can be shared between sufficiently-similar structures in order to improve predictive models. Transfer learning techniques, such as domain adaptation, have been shown to be a highly-useful technology for sharing information between structures when developing statistical classifiers for PBSHM. Nonetheless, transfer-learning techniques are not without their pitfalls. In some circumstances, for example if the data distributions associated with the structures within a population are dissimilar, applying transfer-learning methods can be detrimental to classification performance -- this phenomenon is known as negative transfer. Given the potentially-severe consequences of negative transfer, it is prudent for engineers to ask the question `when, what, and how should one transfer between structures?'. The current paper aims to demonstrate a transfer-strategy decision process for a classification task for a population of simulated structures in the context of a representative SHM maintenance problem, supported by domain adaptation. The transfer decision framework is based upon the concept of expected value of information transfer. In order to compute the expected value of information transfer, predictions must be made regarding the classification (and decision performance) in the target domain following information transfer. In order to forecast the outcome of transfers, a probabilistic regression is used here to predict classification performance from a proxy for structural similarity based on the modal assurance criterion.

information transfer, source domain, structural similarity, (11 more...)

arXiv.org Artificial Intelligence

2311.03083

Country:

Europe > United Kingdom > England > Cheshire > Warrington (0.04)
Europe > Denmark > North Jutland > Aalborg (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Consumer Health (0.51)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
(2 more...)

Add feedback

Estimation and inference for transfer learning with high-dimensional quantile regression

Huang, Jiayu, Wang, Mingqiu, Wu, Yuanshan

arXiv.org Machine LearningNov-5-2023

Transfer learning has become an essential technique to exploit information from the source domain to boost performance of the target task. Despite the prevalence in high-dimensional data, heterogeneity and heavy tails are insufficiently accounted for by current transfer learning approaches and thus may undermine the resulting performance. We propose a transfer learning procedure in the framework of high-dimensional quantile regression models to accommodate heterogeneity and heavy tails in the source and target domains. We establish error bounds of transfer learning estimator based on delicately selected transferable source domains, showing that lower error bounds can be achieved for critical selection criterion and larger sample size of source tasks. We further propose valid confidence interval and hypothesis test procedures for individual component of high-dimensional quantile regression coefficients by advocating a double transfer learning estimator, which is one-step debiased estimator for the transfer learning estimator wherein the technique of transfer learning is designed again. By adopting data-splitting technique, we advocate a transferability detection approach that guarantees to circumvent negative transfer and identify transferable sources with high probability. Simulation results demonstrate that the proposed method exhibits some favorable and compelling performances and the practical utility is further illustrated by analyzing a real example.

ora, probability, theorem 2, (14 more...)

arXiv.org Machine Learning

2211.14578

Country:

Asia > China > Hunan Province (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York (0.04)
(4 more...)

Genre: Research Report > New Finding (0.47)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.54)

Add feedback

To Stay or Not to Stay in the Pre-train Basin: Insights on Ensembling in Transfer Learning

Sadrtdinov, Ildus, Pozdeev, Dmitrii, Vetrov, Dmitry, Lobacheva, Ekaterina

arXiv.org Machine LearningNov-3-2023

Transfer learning and ensembling are two popular techniques for improving the performance and robustness of neural networks. Due to the high cost of pre-training, ensembles of models fine-tuned from a single pre-trained checkpoint are often used in practice. Such models end up in the same basin of the loss landscape, which we call the pre-train basin, and thus have limited diversity. In this work, we show that ensembles trained from a single pre-trained checkpoint may be improved by better exploring the pre-train basin, however, leaving the basin results in losing the benefits of transfer learning and in degradation of the ensemble quality. Based on the analysis of existing exploration methods, we propose a more effective modification of the Snapshot Ensembles (SSE) for transfer learning setup, StarSSE, which results in stronger ensembles and uniform model soups.

accuracy, artificial intelligence, machine learning, (18 more...)

arXiv.org Machine Learning

2303.03374

Country:

Asia > Middle East > Jordan (0.04)
Europe > Russia (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Asia > Russia (0.04)

Genre: Research Report (1.00)

Industry: Government (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Transfer learning for improved generalizability in causal physics-informed neural networks for beam simulations

Kapoor, Taniya, Wang, Hongrui, Nunez, Alfredo, Dollevoet, Rolf

arXiv.org Artificial IntelligenceNov-1-2023

This paper introduces a novel methodology for simulating the dynamics of beams on elastic foundations. Specifically, Euler-Bernoulli and Timoshenko beam models on the Winkler foundation are simulated using a transfer learning approach within a causality-respecting physics-informed neural network (PINN) framework. Conventional PINNs encounter challenges in handling large space-time domains, even for problems with closed-form analytical solutions. A causality-respecting PINN loss function is employed to overcome this limitation, effectively capturing the underlying physics. However, it is observed that the causality-respecting PINN lacks generalizability. We propose using solutions to similar problems instead of training from scratch by employing transfer learning while adhering to causality to accelerate convergence and ensure accurate results across diverse scenarios. Numerical experiments on the Euler-Bernoulli beam highlight the efficacy of the proposed approach for various initial conditions, including those with noise in the initial data. Furthermore, the potential of the proposed method is demonstrated for the Timoshenko beam in an extended spatial and temporal domain. Several comparisons suggest that the proposed method accurately captures the inherent dynamics, outperforming the state-of-the-art physics-informed methods under standard $L^2$-norm metric and accelerating convergence.

beam simulation, causal physics-informed neural network, generalizability

arXiv.org Artificial Intelligence

2311.00578

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.80)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.60)

Add feedback

Transfer Learning in Transformer-Based Demand Forecasting For Home Energy Management System

Gokhale, Gargya, Van Gompel, Jonas, Claessens, Bert, Develder, Chris

arXiv.org Artificial IntelligenceOct-29-2023

Increasingly, homeowners opt for photovoltaic (PV) systems and/or battery storage to minimize their energy bills and maximize renewable energy usage. This has spurred the development of advanced control algorithms that maximally achieve those goals. However, a common challenge faced while developing such controllers is the unavailability of accurate forecasts of household power consumption, especially for shorter time resolutions (15 minutes) and in a data-efficient manner. In this paper, we analyze how transfer learning can help by exploiting data from multiple households to improve a single house's load forecasting. Specifically, we train an advanced forecasting model (a temporal fusion transformer) using data from multiple different households, and then finetune this global model on a new household with limited data (i.e. only a few days). The obtained models are used for forecasting power consumption of the household for the next 24 hours~(day-ahead) at a time resolution of 15 minutes, with the intention of using these forecasts in advanced controllers such as Model Predictive Control. We show the benefit of this transfer learning setup versus solely using the individual new household's data, both in terms of (i) forecasting accuracy ($\sim$15\% MAE reduction) and (ii) control performance ($\sim$2\% energy cost reduction), using real-world household data.

forecast, household, tft model, (13 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3600100.3626635

2310.19159

Country:

North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.05)
Europe > Belgium > Flanders > East Flanders > Ghent (0.05)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Genre: Research Report (0.40)

Industry:

Energy > Power Industry (1.00)
Energy > Renewable > Solar (0.55)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.85)

Add feedback

PETA: Evaluating the Impact of Protein Transfer Learning with Sub-word Tokenization on Downstream Applications

Tan, Yang, Li, Mingchen, Tan, Pan, Zhou, Ziyi, Yu, Huiqun, Fan, Guisheng, Hong, Liang

arXiv.org Artificial IntelligenceOct-26-2023

Large protein language models are adept at capturing the underlying evolutionary information in primary structures, offering significant practical value for protein engineering. Compared to natural language models, protein amino acid sequences have a smaller data volume and a limited combinatorial space. Choosing an appropriate vocabulary size to optimize the pre-trained model is a pivotal issue. Moreover, despite the wealth of benchmarks and studies in the natural language community, there remains a lack of a comprehensive benchmark for systematically evaluating protein language model quality. Given these challenges, PETA trained language models with 14 different vocabulary sizes under three tokenization methods. It conducted thousands of tests on 33 diverse downstream datasets to assess the models' transfer learning capabilities, incorporating two classification heads and three random seeds to mitigate potential biases. Extensive experiments indicate that vocabulary sizes between 50 and 200 optimize the model, whereas sizes exceeding 800 detrimentally affect the model's representational performance. Our code, model weights and datasets are available at https://github.com/ginnm/ProteinPretraining.

downstream application, protein transfer learning, sub-word tokenization, (1 more...)

arXiv.org Artificial Intelligence

2310.17415

Genre: Research Report (0.40)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.53)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.60)

Add feedback

TIES-Merging: Resolving Interference When Merging Models

Yadav, Prateek, Tam, Derek, Choshen, Leshem, Raffel, Colin, Bansal, Mohit

arXiv.org Artificial IntelligenceOct-26-2023

Transfer learning - i.e., further fine-tuning a pre-trained model on a downstream task - can confer significant advantages, including improved downstream performance, faster convergence, and better sample efficiency. These advantages have led to a proliferation of task-specific fine-tuned models, which typically can only perform a single task and do not benefit from one another. Recently, model merging techniques have emerged as a solution to combine multiple task-specific models into a single multitask model without performing additional training. However, existing merging methods often ignore the interference between parameters of different models, resulting in large performance drops when merging multiple models. In this paper, we demonstrate that prior merging techniques inadvertently lose valuable information due to two major sources of interference: (a) interference due to redundant parameter values and (b) disagreement on the sign of a given parameter's values across models. To address this, we propose our method, TRIM, ELECT SIGN & MERGE (TIES-Merging), which introduces three novel steps when merging models: (1) resetting parameters that only changed a small amount during fine-tuning, (2) resolving sign conflicts, and (3) merging only the parameters that are in alignment with the final agreed-upon sign. We find that TIES-Merging outperforms several existing methods in diverse settings covering a range of modalities, domains, number of tasks, model sizes, architectures, and fine-tuning settings. We further analyze the impact of different types of interference on model parameters, and highlight the importance of resolving sign interference. Our code is available at https://github.com/prateeky2806/ties-merging

erging, interference, vector, (14 more...)

arXiv.org Artificial Intelligence

2306.01708

Country:

North America > Canada > Ontario > Toronto (0.04)
North America > United States > North Carolina (0.04)
North America > United States > New York > New York County > New York City (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Model-Based Reasoning (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.34)

Add feedback

PeFLL: Personalized Federated Learning by Learning to Learn

Scott, Jonathan, Zakerinia, Hossein, Lampert, Christoph H.

arXiv.org Artificial IntelligenceOct-25-2023

We present PeFLL, a new personalized federated learning algorithm that improves over the state-of-the-art in three aspects: 1) it produces more accurate models, especially in the low-data regime, and not only for clients present during its training phase, but also for any that may emerge in the future; 2) it reduces the amount of on-client computation and client-server communication by providing future clients with ready-to-use personalized models that require no additional finetuning or optimization; 3) it comes with theoretical guarantees that establish generalization from the observed clients to future ones. At the core of PeFLL lies a learning-to-learn approach that jointly trains an embedding network and a hypernetwork. The embedding network is used to represent clients in a latent descriptor space in a way that reflects their similarity to each other. The hypernetwork takes as input such descriptors and outputs the parameters of fully personalized client models. In combination, both networks constitute a learning algorithm that achieves state-of-the-art performance in several personalized federated learning benchmarks.

learning, pefll, personalized federated learning

arXiv.org Artificial Intelligence

2306.05515

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.60)

Add feedback

Cross-Domain HAR: Few Shot Transfer Learning for Human Activity Recognition

Thukral, Megha, Haresamudram, Harish, Ploetz, Thomas

arXiv.org Artificial IntelligenceOct-22-2023

The ubiquitous availability of smartphones and smartwatches with integrated inertial measurement units (IMUs) enables straightforward capturing of human activities. For specific applications of sensor based human activity recognition (HAR), however, logistical challenges and burgeoning costs render especially the ground truth annotation of such data a difficult endeavor, resulting in limited scale and diversity of datasets. Transfer learning, i.e., leveraging publicly available labeled datasets to first learn useful representations that can then be fine-tuned using limited amounts of labeled data from a target domain, can alleviate some of the performance issues of contemporary HAR systems. Yet they can fail when the differences between source and target conditions are too large and/ or only few samples from a target application domain are available, each of which are typical challenges in real-world human activity recognition scenarios. In this paper, we present an approach for economic use of publicly available labeled HAR datasets for effective transfer learning. We introduce a novel transfer learning framework, Cross-Domain HAR, which follows the teacher-student self-training paradigm to more effectively recognize activities with very limited label information. It bridges conceptual gaps between source and target domains, including sensor locations and type of activities. Through our extensive experimental evaluation on a range of benchmark datasets, we demonstrate the effectiveness of our approach for practically relevant few shot activity recognition scenarios. We also present a detailed analysis into how the individual components of our framework affect downstream performance.

dataset, source dataset, target dataset, (12 more...)

arXiv.org Artificial Intelligence

2310.1439

Country:

North America > United States > Georgia > Fulton County > Atlanta (0.04)
North America > Canada > Newfoundland and Labrador > Labrador (0.04)

Genre: Research Report (1.00)

Industry:

Education (1.00)
Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback