Transfer Learning
Model Recycling Framework for Multi-Source Data-Free Supervised Transfer Learning
This situation can give rise to privacy concerns, as organizations may not want to share sensitive information; for instance, healthcare providers may be reluctant to share patient information and security system maintainers may not want to risk sharing facial recognition data for system performance updates. Additionally, there may be issues with obtaining the source data such as when it is hard to retrieve due to technical difficulties or intellectual property restrictions (Li et al., 2020b; Chen et al., 2021; Liang et al., 2020; Ahmed et al., 2021b). Recent advancements in source-free unsupervised domain adaptation (SFUDA) have presented solutions for a scenario where source data is not accessible (Fang et al., 2022). Purposely, SFUDA utilizes pre-trained source models to improve the generalization of a model on an unlabeled target dataset. Our work is similar to other approaches in the field of SFUDA (Li et al., 2020b; Chen et al., 2021; Liang et al., 2020; Ahmed et al., 2021b), in that it addresses the practical scenario where source data is not available during training. Importantly, a crucial aspect is often overlooked by the majority of SFUDA studies. When it is assumed that source data is not accessible, then it cannot be guaranteed that the available source models have been trained on domains related to the target task. And yet, most of the works only have experimented on classic domain adaptation benchmarks, which are somewhat related by design, e.g., Digits-Five (Peng et al., 2019), Office-31 (Saenko et al., 2010), and Office-Home (Venkateswara et al., 2017), i.e,, domains that share the same labels but are dissimilar in feature (and ambient) space. Our approach is unique in that we consider such a source-free supervised transfer learning (SFSTL) setting (Lee et al., 2019), where we do not assume source models are trained on tasks with similar feature spaces or 1
DiFuse-Net: RGB and Dual-Pixel Depth Estimation using Window Bi-directional Parallax Attention and Cross-modal Transfer Learning
Swami, Kunal, Gupta, Debtanu, Muduli, Amrit Kumar, Jaiswal, Chirag, Bajpai, Pankaj Kumar
-- Depth estimation is crucial for intelligent systems, enabling applications from autonomous navigation to augmented reality. While traditional stereo and active depth sensors have limitations in cost, power, and robustness, dual-pixel (DP) technology, ubiquitous in modern cameras, offers a compelling alternative. This paper introduces DiFuse-Net, a novel modality decoupled network design for disentangled RGB and DP based depth estimation. DiFuse-Net features a window bi-directional parallax attention mechanism (WBiPAM) specifically designed to capture the subtle DP disparity cues unique to smartphone cameras with small aperture. A separate encoder extracts contextual information from the RGB image, and these features are fused to enhance depth prediction. We also propose a Cross-modal Transfer Learning (CmTL) mechanism to utilize large-scale RGB-D datasets in the literature to cope with the limitations of obtaining large-scale RGB-DP-D dataset. Our evaluation and comparison of the proposed method demonstrates its superiority over the DP and stereo-based baseline methods. Additionally, we contribute a new, high-quality, real-world RGB-DP-D training dataset, named Dual-Camera Dual-Pixel (DCDP) dataset, created using our novel symmetric stereo camera hardware setup, stereo calibration and rectification protocol, and AI stereo disparity estimation method.
Formal Bayesian Transfer Learning via the Total Risk Prior
Wycoff, Nathan, Arab, Ali, Singh, Lisa O.
In analyses with severe data-limitations, augmenting the target dataset with information from ancillary datasets in the application domain, called source datasets, can lead to significantly improved statistical procedures. However, existing methods for this transfer learning struggle to deal with situations where the source datasets are also limited and not guaranteed to be well-aligned with the target dataset. A typical strategy is to use the empirical loss minimizer on the source data as a prior mean for the target parameters, which places the estimation of source parameters outside of the Bayesian formalism. Our key conceptual contribution is to use a risk minimizer conditional on source parameters instead. This allows us to construct a single joint prior distribution for all parameters from the source datasets as well as the target dataset. As a consequence, we benefit from full Bayesian uncertainty quantification and can perform model averaging via Gibbs sampling over indicator variables governing the inclusion of each source dataset. We show how a particular instantiation of our prior leads to a Bayesian Lasso in a transformed coordinate system and discuss computational techniques to scale our approach to moderately sized datasets. We also demonstrate that recently proposed minimax-frequentist transfer learning techniques may be viewed as an approximate Maximum a Posteriori approach to our model. Finally, we demonstrate superior predictive performance relative to the frequentist baseline on a genetics application, especially when the source data are limited.
Physics-informed transfer learning for SHM via feature selection
Poole, J., Gardner, P., Hughes, A. J., Dervilis, N., Mills, R. S., Dardeno, T. A., Worden, K.
Data used for training structural health monitoring (SHM) systems are expensive and often impractical to obtain, particularly labelled data. Population-based SHM presents a potential solution to this issue by considering the available data across a population of structures. However, differences between structures will mean the training and testing distributions will differ; thus, conventional machine learning methods cannot be expected to generalise between structures. To address this issue, transfer learning (TL), can be used to leverage information across related domains. An important consideration is that the lack of labels in the target domain limits data-based metrics to quantifying the discrepancy between the marginal distributions. Thus, a prerequisite for the application of typical unsupervised TL methods is to identify suitable source structures (domains), and a set of features, for which the conditional distributions are related to the target structure. Generally, the selection of domains and features is reliant on domain expertise; however, for complex mechanisms, such as the influence of damage on the dynamic response of a structure, this task is not trivial. In this paper, knowledge of physics is leveraged to select more similar features, the modal assurance criterion (MAC) is used to quantify the correspondence between the modes of healthy structures. The MAC is shown to have high correspondence with a supervised metric that measures joint-distribution similarity, which is the primary indicator of whether a classifier will generalise between domains. The MAC is proposed as a measure for selecting a set of features that behave consistently across domains when subjected to damage, i.e. features with invariance in the conditional distributions. This approach is demonstrated on numerical and experimental case studies to verify its effectiveness in various applications.
UWB Radar-based Heart Rate Monitoring: A Transfer Learning Approach
Gruzewska, Elzbieta, Rao, Pooja, Baur, Sebastien, Baugh, Matthew, Bellaiche, Mathias M. J., Srinivas, Sharanya, Ponce, Octavio, Thompson, Matthew, Rudrapatna, Pramod, Sanchez, Michael A., Cai, Lawrence Z., Chico, Timothy JA, Storey, Robert F., Maz, Emily, Telang, Umesh, Shetty, Shravya, Daswani, Mayank
Radar technology presents untapped potential for continuous, contactless, and passive heart rate monitoring via consumer electronics like mobile phones. However the variety of available radar systems and lack of standardization means that a large new paired dataset collection is required for each radar system. This study demonstrates transfer learning between frequency-modulated continuous wave (FMCW) and impulse-radio ultra-wideband (IR-UWB) radar systems, both increasingly integrated into consumer devices. FMCW radar utilizes a continuous chirp, while IR-UWB radar employs short pulses. Our mm-wave FMCW radar operated at 60 GHz with a 5.5 GHz bandwidth (2.7 cm resolution, 3 receiving antennas [Rx]), and our IR-UWB radar at 8 GHz with a 500 MHz bandwidth (30 cm resolution, 2 Rx). Using a novel 2D+1D ResNet architecture we achieved a mean absolute error (MAE) of 0.85 bpm and a mean absolute percentage error (MAPE) of 1.42% for heart rate monitoring with FMCW radar (N=119 participants, an average of 8 hours per participant). This model maintained performance (under 5 MAE/10% MAPE) across various body positions and heart rate ranges, with a 98.9% recall. We then fine-tuned a variant of this model, trained on single-antenna and single-range bin FMCW data, using a small (N=376, avg 6 minutes per participant) IR-UWB dataset. This transfer learning approach yielded a model with MAE 4.1 bpm and MAPE 6.3% (97.5% recall), a 25% MAE reduction over the IR-UWB baseline. This demonstration of transfer learning between radar systems for heart rate monitoring has the potential to accelerate its introduction into existing consumer devices.
Coefficient Shape Transfer Learning for Functional Linear Regression
Jiao, Shuhao, Mckeague, Ian W., Chan, N. -H.
In this paper, we develop a novel transfer learning methodology to tackle the challenge of data scarcity in functional linear models. The methodology incorporates samples from the target model (target domain) alongside those from auxiliary models (source domains), transferring knowledge of coefficient shape from the source domains to the target domain. This shape-based knowledge transfer offers two key advantages. First, it is robust to covariate scaling, ensuring effectiveness despite variations in data distributions across different source domains. Second, the notion of coefficient shape homogeneity represents a meaningful advance beyond traditional coefficient homogeneity, allowing the method to exploit a wider range of source domains and achieve significantly improved model estimation. We rigorously analyze the convergence rates of the proposed estimator and examine the minimax optimality. Our findings show that the degree of improvement depends not only on the similarity of coefficient shapes between the target and source domains, but also on coefficient magnitudes and the spectral decay rates of the functional covariates covariance operators. To address situations where only a subset of auxiliary models is informative for the target model, we further develop a data-driven procedure for identifying such informative sources. The effectiveness of the proposed methodology is demonstrated through comprehensive simulation studies and an application to occupation time analysis using physical activity data from the U.S. National Health and Nutrition Examination Survey.
A Survey on Prompt Tuning
Li, Zongqian, Su, Yixuan, Collier, Nigel
This survey reviews prompt tuning, a parameter-efficient approach for adapting language models by prepending trainable continuous vectors while keeping the model frozen. We classify existing approaches into two categories: direct prompt learning and transfer learning. Direct prompt learning methods include: general optimization approaches, encoder-based methods, decomposition strategies, and mixture-of-experts frameworks. Transfer learning methods consist of: general transfer approaches, encoder-based methods, and decomposition strategies. For each method, we analyze method designs, innovations, insights, advantages, and disadvantages, with illustrative visualizations comparing different frameworks. We identify challenges in computational efficiency and training stability, and discuss future directions in improving training robustness and broadening application scope.
DS@GT at CheckThat! 2025: Detecting Subjectivity via Transfer-Learning and Corrective Data Augmentation
Heil, Maximilian, Bang, Dionne
This paper presents our submission to Task 1, Subjectivity Detection, of the CheckThat! Lab at CLEF 2025. We investigate the effectiveness of transfer-learning and stylistic data augmentation to improve classification of subjective and objective sentences in English news text. Our approach contrasts fine-tuning of pre-trained encoders and transfer-learning of fine-tuned transformer on related tasks. We also introduce a controlled augmentation pipeline using GPT-4o to generate paraphrases in predefined subjectivity styles. To ensure label and style consistency, we employ the same model to correct and refine the generated samples. Results show that transfer-learning of specified encoders outperforms fine-tuning general-purpose ones, and that carefully curated augmentation significantly enhances model robustness, especially in detecting subjective content. Our official submission placed us $16^{th}$ of 24 participants. Overall, our findings underscore the value of combining encoder specialization with label-consistent augmentation for improved subjectivity detection. Our code is available at https://github.com/dsgt-arc/checkthat-2025-subject.
PSAT: Pediatric Segmentation Approaches via Adult Augmentations and Transfer Learning
Kirscher, Tristan, Faisan, Sylvain, Coubez, Xavier, Barrier, Loris, Meyer, Philippe
Pediatric medical imaging presents unique challenges due to significant anatomical and developmental differences compared to adults. Direct application of segmentation models trained on adult data often yields suboptimal performance, particularly for small or rapidly evolving structures. To address these challenges, several strategies leveraging the nnU-Net framework have been proposed, differing along four key axes: (i) the fingerprint dataset (adult, pediatric, or a combination thereof) from which the Training Plan -- including the network architecture--is derived; (ii) the Learning Set (adult, pediatric, or mixed), (iii) Data Augmentation parameters, and (iv) the Transfer learning method (fine-tuning versus continual learning). In this work, we introduce PSAT (Pediatric Segmentation Approaches via Adult Augmentations and Transfer learning), a systematic study that investigates the impact of these axes on segmentation performance. We benchmark the derived strategies on two pediatric CT datasets and compare them with state-of-the-art methods, including a commercial radiotherapy solution. PSAT highlights key pitfalls and provides actionable insights for improving pediatric segmentation. Our experiments reveal that a training plan based on an adult fingerprint dataset is misaligned with pediatric anatomy--resulting in significant performance degradation, especially when segmenting fine structures--and that continual learning strategies mitigate institutional shifts, thus enhancing generalization across diverse pediatric datasets.
Understanding Knowledge Transferability for Transfer Learning: A Survey
Wang, Haohua, Wang, Jingge, Zhao, Zijie, Tan, Yang, Wu, Yanru, Liu, Hanbing, Yang, Jingyun, Zhang, Enming, Chen, Xiangyu, Rong, Zhengze, Guo, Shanxin, Li, Yang
Transfer learning has become an essential paradigm in artificial intelligence, enabling the transfer of knowledge from a source task to improve performance on a target task. This approach, particularly through techniques such as pretraining and fine-tuning, has seen significant success in fields like computer vision and natural language processing. However, despite its widespread use, how to reliably assess the transferability of knowledge remains a challenge. Understanding the theoretical underpinnings of each transferability metric is critical for ensuring the success of transfer learning. In this survey, we provide a unified taxonomy of transferability metrics, categorizing them based on transferable knowledge types and measurement granularity. This work examines the various metrics developed to evaluate the potential of source knowledge for transfer learning and their applicability across different learning paradigms emphasizing the need for careful selection of these metrics. By offering insights into how different metrics work under varying conditions, this survey aims to guide researchers and practitioners in selecting the most appropriate metric for specific applications, contributing to more efficient, reliable, and trustworthy AI systems. Finally, we discuss some open challenges in this field and propose future research directions to further advance the application of transferability metrics in trustworthy transfer learning.