AITopics | Transfer Learning

Transfer Learning

Transfer Learning is the reuse of a pre-trained model on a new problem. (Towards Data Science)

Active Learning and Transfer Learning for Anomaly Detection in Time-Series Data

Kelleher, John D., Nicholson, Matthew, Agrahari, Rahul, Conran, Clare

arXiv.org Artificial Intelligence, Aug-7-2025

This paper examines the effectiveness of combining active learning and transfer learning for anomaly detection in cross-domain time-series data. Our results indicate that there is an interaction between clustering and active learning and that, in general, the best performance is achieved using a single cluster (in other words, when clustering is not applied). We also find that adding new samples to the training set using active learning does improve model performance, but that, in general, the rate of improvement is slower than the results reported in the literature suggest. We attribute this difference to an improved experimental design in which distinct data samples are used for the sampling and testing pools. Finally, we assess the ceiling performance of transfer learning in combination with active learning across several datasets and find that performance initially improves but eventually begins to tail off as more target points are selected for inclusion in training. This tail-off may indicate that the active learning process is doing a good job of sequencing data points for selection, pushing the less useful points towards the end of the selection process, and that the tail-off occurs when these less useful points are eventually added. Taken together, our results indicate that active learning is effective but that the improvement in model performance follows a flat linear function of the number of points selected and labelled.
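The experimental-design point above (separate sampling and testing pools) can be illustrated with a minimal sketch. The function names, the 0.5 decision boundary, and the closest-to-boundary query rule are assumptions made for illustration; the abstract does not state which query strategy the authors use.

```python
import random


def split_pools(indices, test_fraction=0.3, seed=0):
    """Split item indices into disjoint sampling and testing pools."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    # Points in the test pool are never offered to the active learner.
    return shuffled[n_test:], shuffled[:n_test]


def select_most_uncertain(scores, pool, k):
    """Pick the k pool items whose anomaly score is closest to 0.5.

    Assumption: scores lie in [0, 1] and 0.5 is the decision boundary.
    """
    ranked = sorted(pool, key=lambda i: abs(scores[i] - 0.5))
    return ranked[:k]


# Hypothetical usage: ten items, three held out purely for testing.
sampling_pool, test_pool = split_pools(range(10))
example_scores = {0: 0.9, 1: 0.52, 2: 0.1, 3: 0.45}
queried = select_most_uncertain(example_scores, [0, 1, 2, 3], k=2)
```

Because queried points are drawn only from the sampling pool, the test pool stays untouched by the selection process, which is the separation the abstract credits for its more conservative improvement rates.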

artificial intelligence, data mining, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2508.03921

Country:

Europe > Ireland (0.28)
North America > United States (0.28)

Genre: Research Report > New Finding (0.55)

Industry: Information Technology (0.46)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)

Model Recycling Framework for Multi-Source Data-Free Supervised Transfer Learning

Wang, Sijia, Henao, Ricardo

arXiv.org Machine Learning, Aug-5-2025

This situation can give rise to privacy concerns, as organizations may not want to share sensitive information; for instance, healthcare providers may be reluctant to share patient information, and security system maintainers may not want to risk sharing facial recognition data for system performance updates. Additionally, there may be issues with obtaining the source data, such as when it is hard to retrieve due to technical difficulties or intellectual property restrictions (Li et al., 2020b; Chen et al., 2021; Liang et al., 2020; Ahmed et al., 2021b). Recent advancements in source-free unsupervised domain adaptation (SFUDA) have presented solutions for a scenario where source data is not accessible (Fang et al., 2022). Specifically, SFUDA utilizes pre-trained source models to improve the generalization of a model on an unlabeled target dataset. Our work is similar to other approaches in the field of SFUDA (Li et al., 2020b; Chen et al., 2021; Liang et al., 2020; Ahmed et al., 2021b), in that it addresses the practical scenario where source data is not available during training. Importantly, a crucial aspect is often overlooked by the majority of SFUDA studies. When it is assumed that source data is not accessible, it cannot be guaranteed that the available source models have been trained on domains related to the target task. And yet, most works have experimented only on classic domain adaptation benchmarks, which are somewhat related by design, e.g., Digits-Five (Peng et al., 2019), Office-31 (Saenko et al., 2010), and Office-Home (Venkateswara et al., 2017), i.e., domains that share the same labels but are dissimilar in feature (and ambient) space. Our approach is unique in that we consider such a source-free supervised transfer learning (SFSTL) setting (Lee et al., 2019), where we do not assume source models are trained on tasks with similar feature spaces or

artificial intelligence, deep learning, machine learning, (20 more...)

arXiv.org Machine Learning

2508.02039

Country:

Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
Europe > Greece (0.04)
Europe > Czechia > Prague (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Transportation > Ground > Road (0.46)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

DiFuse-Net: RGB and Dual-Pixel Depth Estimation using Window Bi-directional Parallax Attention and Cross-modal Transfer Learning

Swami, Kunal, Gupta, Debtanu, Muduli, Amrit Kumar, Jaiswal, Chirag, Bajpai, Pankaj Kumar

arXiv.org Artificial Intelligence, Aug-4-2025

Depth estimation is crucial for intelligent systems, enabling applications from autonomous navigation to augmented reality. While traditional stereo and active depth sensors have limitations in cost, power, and robustness, dual-pixel (DP) technology, ubiquitous in modern cameras, offers a compelling alternative. This paper introduces DiFuse-Net, a novel modality-decoupled network design for disentangled RGB and DP based depth estimation. DiFuse-Net features a window bi-directional parallax attention mechanism (WBiPAM) specifically designed to capture the subtle DP disparity cues unique to smartphone cameras with small apertures. A separate encoder extracts contextual information from the RGB image, and these features are fused to enhance depth prediction. We also propose a Cross-modal Transfer Learning (CmTL) mechanism that utilizes large-scale RGB-D datasets in the literature to cope with the difficulty of obtaining a large-scale RGB-DP-D dataset. Our evaluation and comparison of the proposed method demonstrate its superiority over the DP and stereo-based baseline methods. Additionally, we contribute a new, high-quality, real-world RGB-DP-D training dataset, named the Dual-Camera Dual-Pixel (DCDP) dataset, created using our novel symmetric stereo camera hardware setup, stereo calibration and rectification protocol, and AI stereo disparity estimation method.

artificial intelligence, image understanding, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2506.14709

Country: Asia (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Communications > Mobile (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.95)
(2 more...)

Formal Bayesian Transfer Learning via the Total Risk Prior

Wycoff, Nathan, Arab, Ali, Singh, Lisa O.

arXiv.org Machine Learning, Aug-1-2025

In analyses with severe data limitations, augmenting the target dataset with information from ancillary datasets in the application domain, called source datasets, can lead to significantly improved statistical procedures. However, existing methods for this kind of transfer learning struggle to deal with situations where the source datasets are also limited and not guaranteed to be well-aligned with the target dataset. A typical strategy is to use the empirical loss minimizer on the source data as a prior mean for the target parameters, which places the estimation of source parameters outside of the Bayesian formalism. Our key conceptual contribution is to use a risk minimizer conditional on source parameters instead. This allows us to construct a single joint prior distribution for all parameters from the source datasets as well as the target dataset. As a consequence, we benefit from full Bayesian uncertainty quantification and can perform model averaging via Gibbs sampling over indicator variables governing the inclusion of each source dataset. We show how a particular instantiation of our prior leads to a Bayesian Lasso in a transformed coordinate system and discuss computational techniques to scale our approach to moderately sized datasets. We also demonstrate that recently proposed minimax-frequentist transfer learning techniques may be viewed as an approximate Maximum a Posteriori approach to our model. Finally, we demonstrate superior predictive performance relative to the frequentist baseline on a genetics application, especially when the source data are limited.

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Machine Learning

2507.23768

Country:

North America > United States > New York (0.04)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.93)

Physics-informed transfer learning for SHM via feature selection

Poole, J., Gardner, P., Hughes, A. J., Dervilis, N., Mills, R. S., Dardeno, T. A., Worden, K.

arXiv.org Artificial Intelligence, Jul-29-2025

Data used for training structural health monitoring (SHM) systems are expensive and often impractical to obtain, particularly labelled data. Population-based SHM presents a potential solution to this issue by considering the available data across a population of structures. However, differences between structures mean that the training and testing distributions will differ; thus, conventional machine learning methods cannot be expected to generalise between structures. To address this issue, transfer learning (TL) can be used to leverage information across related domains. An important consideration is that the lack of labels in the target domain limits data-based metrics to quantifying the discrepancy between the marginal distributions. Thus, a prerequisite for the application of typical unsupervised TL methods is to identify suitable source structures (domains), and a set of features, for which the conditional distributions are related to the target structure. Generally, the selection of domains and features relies on domain expertise; however, for complex mechanisms, such as the influence of damage on the dynamic response of a structure, this task is not trivial. In this paper, knowledge of physics is leveraged to select more similar features: the modal assurance criterion (MAC) is used to quantify the correspondence between the modes of healthy structures. The MAC is shown to have high correspondence with a supervised metric that measures joint-distribution similarity, which is the primary indicator of whether a classifier will generalise between domains. The MAC is proposed as a measure for selecting a set of features that behave consistently across domains when subjected to damage, i.e. features with invariance in the conditional distributions. This approach is demonstrated on numerical and experimental case studies to verify its effectiveness in various applications.
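For readers unfamiliar with the MAC, the sketch below uses its common textbook definition for real-valued mode shapes: the squared inner product of two mode shape vectors divided by the product of their squared norms. The function name `mac` and the example vectors are illustrative assumptions and are not taken from the paper, which may use a complex-valued or matrix form.

```python
def mac(phi_a, phi_b):
    """Modal assurance criterion between two real-valued mode shape vectors.

    Returns a value in [0, 1]: 1 means the shapes are identical up to a
    scale factor (including sign), 0 means they are orthogonal.
    """
    if len(phi_a) != len(phi_b):
        raise ValueError("mode shapes must have the same length")
    cross = sum(a * b for a, b in zip(phi_a, phi_b))
    norm_a = sum(a * a for a in phi_a)
    norm_b = sum(b * b for b in phi_b)
    if norm_a == 0 or norm_b == 0:
        raise ValueError("mode shapes must be non-zero")
    return cross ** 2 / (norm_a * norm_b)


# Hypothetical mode shapes sampled at three sensor locations.
scaled_copy = mac([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = mac([1.0, 0.0], [0.0, 1.0])
```

Since the MAC is insensitive to scaling, it compares the shape of a mode rather than its amplitude, which is why it can be computed from healthy-state data alone without labels.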

artificial intelligence, evolutionary algorithm, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2507.19519

Country: Europe > United Kingdom (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Materials > Construction Materials (0.45)
Energy (0.45)
Health & Medicine > Consumer Health (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.67)
(3 more...)

UWB Radar-based Heart Rate Monitoring: A Transfer Learning Approach

Gruzewska, Elzbieta, Rao, Pooja, Baur, Sebastien, Baugh, Matthew, Bellaiche, Mathias M. J., Srinivas, Sharanya, Ponce, Octavio, Thompson, Matthew, Rudrapatna, Pramod, Sanchez, Michael A., Cai, Lawrence Z., Chico, Timothy JA, Storey, Robert F., Maz, Emily, Telang, Umesh, Shetty, Shravya, Daswani, Mayank

arXiv.org Artificial Intelligence, Jul-22-2025

Radar technology presents untapped potential for continuous, contactless, and passive heart rate monitoring via consumer electronics like mobile phones. However, the variety of available radar systems and the lack of standardization mean that a large new paired dataset must be collected for each radar system. This study demonstrates transfer learning between frequency-modulated continuous wave (FMCW) and impulse-radio ultra-wideband (IR-UWB) radar systems, both increasingly integrated into consumer devices. FMCW radar utilizes a continuous chirp, while IR-UWB radar employs short pulses. Our mm-wave FMCW radar operated at 60 GHz with a 5.5 GHz bandwidth (2.7 cm resolution, 3 receiving antennas [Rx]), and our IR-UWB radar at 8 GHz with a 500 MHz bandwidth (30 cm resolution, 2 Rx). Using a novel 2D+1D ResNet architecture, we achieved a mean absolute error (MAE) of 0.85 bpm and a mean absolute percentage error (MAPE) of 1.42% for heart rate monitoring with FMCW radar (N=119 participants, an average of 8 hours per participant). This model maintained performance (under 5 MAE/10% MAPE) across various body positions and heart rate ranges, with a 98.9% recall. We then fine-tuned a variant of this model, trained on single-antenna and single-range bin FMCW data, using a small (N=376, avg 6 minutes per participant) IR-UWB dataset. This transfer learning approach yielded a model with MAE 4.1 bpm and MAPE 6.3% (97.5% recall), a 25% MAE reduction over the IR-UWB baseline. This demonstration of transfer learning between radar systems for heart rate monitoring has the potential to accelerate its introduction into existing consumer devices.
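The resolution figures quoted above are consistent with the standard radar range-resolution relation, resolution = c / (2 x bandwidth). The sketch below checks that arithmetic and shows how MAE and MAPE are conventionally computed. The function names and the heart-rate values are illustrative assumptions, not data or code from the study.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0


def range_resolution_m(bandwidth_hz):
    """Standard radar range resolution in metres: c / (2 * B)."""
    return SPEED_OF_LIGHT_M_S / (2.0 * bandwidth_hz)


def mae(true_values, predictions):
    """Mean absolute error, in the units of the inputs (here bpm)."""
    errors = [abs(t - p) for t, p in zip(true_values, predictions)]
    return sum(errors) / len(errors)


def mape(true_values, predictions):
    """Mean absolute percentage error, in percent."""
    ratios = [abs(t - p) / abs(t) for t, p in zip(true_values, predictions)]
    return 100.0 * sum(ratios) / len(ratios)


fmcw_cm = range_resolution_m(5.5e9) * 100.0   # roughly 2.7 cm
uwb_cm = range_resolution_m(500e6) * 100.0    # roughly 30 cm

# Made-up reference and predicted heart rates (bpm), for illustration only.
reference_bpm = [60.0, 80.0]
predicted_bpm = [62.0, 79.0]
example_mae = mae(reference_bpm, predicted_bpm)
example_mape = mape(reference_bpm, predicted_bpm)
```

The roughly tenfold coarser resolution of the IR-UWB system follows directly from its narrower bandwidth, which helps explain why the transferred model was trained on single-range-bin FMCW data.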

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2507.14195

Country:

North America > United States (1.00)
Europe > United Kingdom > England (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.68)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Sensing and Signal Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Coefficient Shape Transfer Learning for Functional Linear Regression

Jiao, Shuhao, Mckeague, Ian W., Chan, N. -H.

arXiv.org Machine Learning, Jul-11-2025

In this paper, we develop a novel transfer learning methodology to tackle the challenge of data scarcity in functional linear models. The methodology incorporates samples from the target model (target domain) alongside those from auxiliary models (source domains), transferring knowledge of coefficient shape from the source domains to the target domain. This shape-based knowledge transfer offers two key advantages. First, it is robust to covariate scaling, ensuring effectiveness despite variations in data distributions across different source domains. Second, the notion of coefficient shape homogeneity represents a meaningful advance beyond traditional coefficient homogeneity, allowing the method to exploit a wider range of source domains and achieve significantly improved model estimation. We rigorously analyze the convergence rates of the proposed estimator and examine its minimax optimality. Our findings show that the degree of improvement depends not only on the similarity of coefficient shapes between the target and source domains, but also on coefficient magnitudes and the spectral decay rates of the covariance operators of the functional covariates. To address situations where only a subset of auxiliary models is informative for the target model, we further develop a data-driven procedure for identifying such informative sources. The effectiveness of the proposed methodology is demonstrated through comprehensive simulation studies and an application to occupation time analysis using physical activity data from the U.S. National Health and Nutrition Examination Survey.

artificial intelligence, machine learning, source domain, (15 more...)

arXiv.org Machine Learning

2506.11367

Country:

North America > United States (0.28)
Asia > Middle East > Jordan (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report > New Finding (0.86)

Industry: Health & Medicine > Therapeutic Area (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.64)

A Survey on Prompt Tuning

Li, Zongqian, Su, Yixuan, Collier, Nigel

arXiv.org Artificial Intelligence, Jul-10-2025

This survey reviews prompt tuning, a parameter-efficient approach for adapting language models by prepending trainable continuous vectors while keeping the model frozen. We classify existing approaches into two categories: direct prompt learning and transfer learning. Direct prompt learning methods include general optimization approaches, encoder-based methods, decomposition strategies, and mixture-of-experts frameworks. Transfer learning methods consist of general transfer approaches, encoder-based methods, and decomposition strategies. For each method, we analyze method designs, innovations, insights, advantages, and disadvantages, with illustrative visualizations comparing different frameworks. We identify challenges in computational efficiency and training stability, and discuss future directions in improving training robustness and broadening application scope.
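The core idea of prompt tuning (trainable vectors prepended to the input, model weights untouched) can be shown with a deliberately tiny sketch. Everything here is a toy assumption: the "frozen model" is a fixed linear scorer over the mean-pooled sequence rather than a language model, and the names `score` and `tune_prompt` are hypothetical. Only the soft prompt vectors receive gradient updates.

```python
def score(weights, sequence):
    """Frozen toy model: linear score of the mean-pooled sequence."""
    n = len(sequence)
    pooled = [sum(vec[j] for vec in sequence) / n for j in range(len(weights))]
    return sum(w * p for w, p in zip(weights, pooled))


def tune_prompt(weights, token_embeddings, soft_prompt, target,
                lr=0.5, steps=50):
    """Gradient descent on squared error, updating only the prompt vectors."""
    prompt = [list(vec) for vec in soft_prompt]  # trainable copy
    for _ in range(steps):
        sequence = prompt + token_embeddings     # prepend prompt to input
        err = score(weights, sequence) - target
        # d(err^2)/d(prompt[i][j]) = 2 * err * weights[j] / len(sequence)
        scale = 2.0 * err / len(sequence)
        for vec in prompt:
            for j, w in enumerate(weights):
                vec[j] -= lr * scale * w
    return prompt


frozen_weights = [1.0, -1.0]
tokens = [[0.2, 0.1], [0.4, 0.3]]
initial_prompt = [[0.0, 0.0], [0.0, 0.0]]
tuned_prompt = tune_prompt(frozen_weights, tokens, initial_prompt, target=1.0)
tuned_score = score(frozen_weights, tuned_prompt + tokens)
```

In this toy setting the number of trainable values is the prompt length times the embedding width (four numbers), independent of the size of the frozen model, which is the sense in which the approach is parameter-efficient.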

computational linguistic, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2507.06085

Country:

North America > Canada (0.14)
Asia > Middle East > UAE (0.14)

Genre: Overview (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.77)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

DS@GT at CheckThat! 2025: Detecting Subjectivity via Transfer-Learning and Corrective Data Augmentation

Heil, Maximilian, Bang, Dionne

arXiv.org Artificial Intelligence, Jul-9-2025

This paper presents our submission to Task 1, Subjectivity Detection, of the CheckThat! Lab at CLEF 2025. We investigate the effectiveness of transfer learning and stylistic data augmentation for improving the classification of subjective and objective sentences in English news text. Our approach contrasts fine-tuning of pre-trained encoders with transfer learning from transformers fine-tuned on related tasks. We also introduce a controlled augmentation pipeline using GPT-4o to generate paraphrases in predefined subjectivity styles. To ensure label and style consistency, we employ the same model to correct and refine the generated samples. Results show that transfer learning of specialized encoders outperforms fine-tuning general-purpose ones, and that carefully curated augmentation significantly enhances model robustness, especially in detecting subjective content. Our official submission placed 16th of 24 participants. Overall, our findings underscore the value of combining encoder specialization with label-consistent augmentation for improved subjectivity detection. Our code is available at https://github.com/dsgt-arc/checkthat-2025-subject.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2507.06189

Country:

Europe (0.94)
North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Media (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.91)

PSAT: Pediatric Segmentation Approaches via Adult Augmentations and Transfer Learning

Kirscher, Tristan, Faisan, Sylvain, Coubez, Xavier, Barrier, Loris, Meyer, Philippe

arXiv.org Artificial Intelligence, Jul-9-2025

Pediatric medical imaging presents unique challenges due to significant anatomical and developmental differences compared to adults. Direct application of segmentation models trained on adult data often yields suboptimal performance, particularly for small or rapidly evolving structures. To address these challenges, several strategies leveraging the nnU-Net framework have been proposed, differing along four key axes: (i) the fingerprint dataset (adult, pediatric, or a combination thereof) from which the Training Plan, including the network architecture, is derived; (ii) the Learning Set (adult, pediatric, or mixed); (iii) the Data Augmentation parameters; and (iv) the Transfer learning method (fine-tuning versus continual learning). In this work, we introduce PSAT (Pediatric Segmentation Approaches via Adult Augmentations and Transfer learning), a systematic study that investigates the impact of these axes on segmentation performance. We benchmark the derived strategies on two pediatric CT datasets and compare them with state-of-the-art methods, including a commercial radiotherapy solution. PSAT highlights key pitfalls and provides actionable insights for improving pediatric segmentation. Our experiments reveal that a training plan based on an adult fingerprint dataset is misaligned with pediatric anatomy, resulting in significant performance degradation, especially when segmenting fine structures, and that continual learning strategies mitigate institutional shifts, thus enhancing generalization across diverse pediatric datasets.

artificial intelligence, deep learning, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2507.05764

Country: Europe > France (0.15)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Pediatrics/Neonatology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
