Transfer Learning
Query-based Knowledge Transfer for Heterogeneous Learning Environments
Alballa, Norah, Zhang, Wenxuan, Liu, Ziquan, Abdelmoniem, Ahmed M., Elhoseiny, Mohamed, Canini, Marco
Existing solutions like federated learning, ensembles, and transfer learning often fail to adequately serve the unique needs of clients, especially when local data representation is limited. To address this issue, we propose a novel framework called Query-based Knowledge Transfer (QKT) that enables tailored knowledge acquisition to fulfill specific client needs without direct data exchange. QKT employs a data-free masking strategy to facilitate communication-efficient query-focused knowledge transfer while refining task-specific parameters to mitigate knowledge interference and forgetting. Our experiments, conducted on both standard and clinical benchmarks, show that QKT significantly outperforms existing collaborative learning methods by an average of 20.91 percentage points in single-class query settings and an average of 14.32 percentage points in multi-class query scenarios. Further analysis and ablation studies reveal that QKT effectively balances the learning of new and existing knowledge, showing strong potential for its application in decentralized learning. The rapid proliferation of Internet of Things (IoT) devices and increasingly stringent data privacy regulations have highlighted the need for a decentralized machine learning framework. This framework allows models to be trained locally on devices or within organizations and encourages knowledge transfer between models in the network of clients without exchanging raw data. Despite its potential, the decentralized paradigm faces substantial challenges, particularly in addressing the diverse needs of devices and clients in heterogeneous environments. In such environments, each client may have a vastly different local data distribution, resulting in diverse query objectives that might lie outside the local distribution but be relevant to other clients.
For instance, in medical diagnostics, models may be required to detect rare or emerging diseases that are underrepresented locally, necessitating the ability to generalize from similar conditions observed in other regions or populations. Similarly, in fraud detection, the constantly evolving nature of fraudulent activities means that new tactics may not yet be captured in the historical data of certain clients. Consequently, it is helpful for models to rapidly learn from fraud patterns detected elsewhere to remain effective. Previous work has offered valuable solutions to this challenge, but each comes with its own limitations. Collaborative methods like Federated Learning (FL) (McMahan et al., 2017) aggregate knowledge across clients but often struggle to adapt models to the specific needs of individual clients.
Sparse Optimization for Transfer Learning: A L0-Regularized Framework for Multi-Source Domain Adaptation
This paper explores transfer learning in heterogeneous multi-source environments with distributional divergence between target and auxiliary domains. To address challenges in statistical bias and computational efficiency, we propose a Sparse Optimization for Transfer Learning (SOTL) framework based on L0-regularization. The method extends the Joint Estimation Transferred from Strata (JETS) paradigm with two key innovations: (1) L0-constrained exact sparsity for parameter space compression and complexity reduction, and (2) refining optimization focus to emphasize target parameters over redundant ones. Simulations show that SOTL significantly improves both estimation accuracy and computational speed, especially under adversarial auxiliary domain conditions. Empirical validation on the Community and Crime benchmarks demonstrates the statistical robustness of the SOTL method in cross-domain transfer.
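As a rough illustration of what "L0-constrained exact sparsity" means in general, the sketch below projects a coefficient vector onto the set of vectors with at most k nonzero entries by keeping the k largest magnitudes (hard thresholding). This is a generic building block, not the SOTL algorithm; the function name `hard_threshold` and the example values are assumptions made for illustration.

```python
def hard_threshold(beta, k):
    """Keep the k largest-magnitude entries of beta and set the rest to zero.

    This is the Euclidean projection onto {b : ||b||_0 <= k}. It is a
    generic illustration of an L0 constraint, not the SOTL procedure.
    """
    # Indices ordered by absolute value, largest first; keep the first k.
    order = sorted(range(len(beta)), key=lambda i: abs(beta[i]), reverse=True)
    keep = set(order[:k])
    return [b if i in keep else 0.0 for i, b in enumerate(beta)]


if __name__ == "__main__":
    # Hypothetical coefficient vector; only the two largest entries survive.
    print(hard_threshold([0.5, -3.0, 2.0, 0.1], 2))
```

Unlike an L1 penalty, which shrinks every coefficient, this projection leaves the retained entries unchanged and sets the others exactly to zero.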
Privacy-Preserving Transfer Learning for Community Detection using Locally Distributed Multiple Networks
Guo, Xiao, He, Xuming, Chang, Xiangyu, Ma, Shujie
This paper develops a new spectral clustering-based method called TransNet for transfer learning in community detection of network data. Our goal is to improve the clustering performance of the target network using auxiliary source networks, which are heterogeneous, privacy-preserved, and locally stored across various sources. The edges of each locally stored network are perturbed using the randomized response mechanism to achieve differential privacy. Notably, we allow the source networks to have distinct privacy-preserving and heterogeneity levels as often desired in practice. To better utilize the information from the source networks, we propose a novel adaptive weighting method to aggregate the eigenspaces of the source networks multiplied by adaptive weights chosen to incorporate the effects of privacy and heterogeneity. We propose a regularization method that combines the weighted average eigenspace of the source networks with the eigenspace of the target network to achieve an optimal balance between them. Theoretically, we show that the adaptive weighting method enjoys the error-bound-oracle property in the sense that the error bound of the estimated eigenspace only depends on informative source networks. We also demonstrate that TransNet performs better than the estimator using only the target network and the estimator using only the weighted source networks.
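The edge perturbation step can be sketched as follows. This is a minimal illustration of the randomized response mechanism on an undirected 0/1 adjacency matrix, assuming a single keep probability for both edges and non-edges; TransNet itself may use different settings per network. The function names and the bias-correction helper are assumptions for illustration, not the authors' code.

```python
import math
import random


def keep_prob_from_epsilon(epsilon):
    """Keep probability of symmetric randomized response for privacy level epsilon."""
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))


def randomized_response(adj, keep_prob, seed=None):
    """Keep each edge indicator with probability keep_prob, flip it otherwise.

    adj is a symmetric 0/1 matrix (list of lists) with a zero diagonal.
    Only the upper triangle is sampled, then mirrored to stay symmetric.
    """
    rng = random.Random(seed)
    n = len(adj)
    out = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            bit = adj[i][j]
            if rng.random() >= keep_prob:
                bit = 1 - bit  # flip the edge indicator
            out[i][j] = out[j][i] = bit
    return out


def debias(perturbed, keep_prob):
    """Entrywise unbiased estimate of the original matrix.

    E[perturbed] = (2q - 1) * A + (1 - q) for keep probability q,
    so A is estimated by (perturbed - (1 - q)) / (2q - 1); requires q != 0.5.
    """
    n = len(perturbed)
    scale = 2.0 * keep_prob - 1.0
    return [
        [0.0 if i == j else (perturbed[i][j] - (1.0 - keep_prob)) / scale
         for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]  # hypothetical 3-node path graph
    noisy = randomized_response(adj, keep_prob_from_epsilon(1.0), seed=0)
    print(noisy)
```

A smaller epsilon pushes the keep probability toward 0.5, so the perturbed network carries less information; this is one reason a method would want to down-weight heavily perturbed source networks.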
Nonhuman Primate Brain Tissue Segmentation Using a Transfer Learning Approach
Lin, Zhen, Yuan, Hongyu, Barcus, Richard, Lyu, Qing, Chakravarty, Sucheta, Lipford, Megan E., Shively, Carol A., Craft, Suzanne, Kawas, Mohammad, Kim, Jeongchul, Whitlow, Christopher T.
Non-human primates (NHPs) serve as critical models for understanding human brain function and neurological disorders due to their close evolutionary relationship with humans. Accurate brain tissue segmentation in NHPs is critical for understanding neurological disorders, but challenging due to the scarcity of annotated NHP brain MRI datasets, the small size of the NHP brain, the limited resolution of available imaging data and the anatomical differences between human and NHP brains. To address these challenges, we propose a novel approach utilizing STU-Net with transfer learning to leverage knowledge transferred from human brain MRI data to enhance segmentation accuracy in the NHP brain MRI, particularly when training data is limited. Specifically, we first train our STU-Net model on the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset, allowing our model to learn generalizable features of human brain anatomy. This model is then fine-tuned on a small dataset of vervet brain MRI from The Aging Vervet Colony (AVC) at Wake Forest Alzheimer's Disease Research Center (ADRC) to adapt to the NHP-specific neuroanatomy. This enables accurate segmentation of six key tissue types: grey matter (GM), white matter (WM), CSF, deep grey matter, brainstem, and cerebellum. The combination of STU-Net and transfer learning effectively delineates complex tissue boundaries and captures fine anatomical details specific to NHP brains. Notably, our method demonstrated improvement in segmenting small subcortical structures such as putamen and thalamus that are challenging to resolve with limited spatial resolution and tissue contrast, and achieved DSC of over 0.88, IoU over 0.8 and HD95 under 7. This study introduces a robust method for multi-class brain tissue segmentation in NHPs, potentially accelerating research in evolutionary neuroscience and preclinical studies of neurological disorders relevant to human health.
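For readers unfamiliar with the reported overlap metrics, the sketch below computes the Dice similarity coefficient (DSC) and intersection over union (IoU) for one tissue label from flattened label maps. The function name, the toy label values, and the convention of returning 1.0 when a label is absent from both maps are assumptions for illustration; this is not the authors' evaluation code, and HD95 is not covered.

```python
def dice_and_iou(pred, truth, label):
    """Return (DSC, IoU) for one label, given two equal-length label sequences.

    DSC = 2|P ∩ T| / (|P| + |T|),  IoU = |P ∩ T| / |P ∪ T|.
    """
    p = [x == label for x in pred]
    t = [x == label for x in truth]
    inter = sum(1 for a, b in zip(p, t) if a and b)
    p_sum, t_sum = sum(p), sum(t)
    if p_sum + t_sum == 0:
        # Label absent from both maps: treated as perfect agreement (assumed convention).
        return 1.0, 1.0
    union = p_sum + t_sum - inter
    return 2.0 * inter / (p_sum + t_sum), inter / union


if __name__ == "__main__":
    # Hypothetical 4-voxel example with labels 0, 1, 2.
    pred = [1, 1, 0, 2]
    truth = [1, 0, 0, 2]
    print(dice_and_iou(pred, truth, 1))
```

The two metrics are related by DSC = 2·IoU / (1 + IoU), so for the same segmentation the DSC is never lower than the IoU.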
Optimizing Breast Cancer Detection in Mammograms: A Comprehensive Study of Transfer Learning, Resolution Reduction, and Multi-View Classification
Petrini, Daniel G. P., Kim, Hae Yong
This study explores open questions in the application of machine learning for breast cancer detection in mammograms. Current approaches often employ a two-stage transfer learning process: first, adapting a backbone model trained on natural images to develop a patch classifier, which is then used to create a single-view whole-image classifier. Additionally, many studies leverage both mammographic views to enhance model performance. In this work, we systematically investigate five key questions: (1) Is the intermediate patch classifier essential for optimal performance? (2) Do backbone models that excel in natural image classification consistently outperform others on mammograms? (3) When reducing mammogram resolution for GPU processing, does the learn-to-resize technique outperform conventional methods? (4) Does incorporating both mammographic views in a two-view classifier significantly improve detection accuracy? (5) How do these findings vary when analyzing low-quality versus high-quality mammograms? By addressing these questions, we developed models that outperform previous results for both single-view and two-view classifiers. Our findings provide insights into model architecture and transfer learning strategies contributing to more accurate and efficient mammogram analysis.
PAD: Towards Efficient Data Generation for Transfer Learning Using Phrase Alignment
Kim, Jong Myoung, Lee, Young-Jun, Choi, Ho-Jin, Jung, Sangkeun
Transfer learning leverages the abundance of English data to address the scarcity of resources in modeling non-English languages, such as Korean. In this study, we explore the potential of Phrase Aligned Data (PAD) from standardized Statistical Machine Translation (SMT) to enhance the efficiency of transfer learning. Through extensive experiments, we demonstrate that PAD synergizes effectively with the syntactic characteristics of the Korean language, mitigating the weaknesses of SMT and significantly improving model performance. Moreover, we reveal that PAD complements traditional data construction methods and enhances their effectiveness when combined. This innovative approach not only boosts model performance but also suggests a cost-efficient solution for resource-scarce languages.
Sample-Efficient Bayesian Transfer Learning for Online Machine Parameter Optimization
Wagner, Philipp, Nagel, Tobias, Leube, Philipp, Huber, Marco F.
Correctly setting the parameters of a production machine is essential to improve product quality, increase efficiency, and reduce production costs while also supporting sustainability goals. Identifying optimal parameters involves an iterative process of producing an object and evaluating its quality. Minimizing the number of iterations is, therefore, desirable to reduce the costs associated with unsuccessful attempts. This work introduces a method to optimize the machine parameters in the system itself using a Bayesian optimization algorithm. By leveraging existing machine data, we use a transfer learning approach in order to identify an optimum with minimal iterations, resulting in a cost-effective transfer learning algorithm. We validate our approach on a laser machine for cutting sheet metal in the real world.
PRIOT: Pruning-Based Integer-Only Transfer Learning for Embedded Systems
Anada, Honoka, Ryu, Sefutsu, Usui, Masayuki, Kaneko, Tatsuya, Takamaeda-Yamazaki, Shinya
On-device transfer learning is crucial for adapting a common backbone model to the unique environment of each edge device. Tiny microcontrollers, such as the Raspberry Pi Pico, are key targets for on-device learning but often lack floating-point units, necessitating integer-only training. Dynamic computation of quantization scale factors, which is adopted in former studies, incurs high computational costs. Therefore, this study focuses on integer-only training with static scale factors, which is challenging with existing training methods. We propose a new training method named PRIOT, which optimizes the network by pruning selected edges rather than updating weights, allowing effective training with static scale factors. The pruning pattern is determined by the edge-popup algorithm, which trains a parameter named score assigned to each edge instead of the original parameters and prunes the edges with low scores before inference. Additionally, we introduce a memory-efficient variant, PRIOT-S, which only assigns scores to a small fraction of edges. We implement PRIOT and PRIOT-S on the Raspberry Pi Pico and evaluate their accuracy and computational costs using a tiny CNN model on the rotated MNIST dataset and the VGG11 model on the rotated CIFAR-10 dataset. Our results demonstrate that PRIOT improves accuracy by 8.08 to 33.75 percentage points over existing methods, while PRIOT-S reduces memory footprint with minimal accuracy loss.
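The idea of training by pruning rather than by updating weights can be sketched as below: integer weights stay frozen, each edge carries a score, and low-scoring edges are masked out before inference. This sketch assumes a top-fraction selection rule in the style of the edge-popup algorithm; the exact selection rule, score updates, and quantization used by PRIOT are not reproduced here, and all names and values are hypothetical.

```python
def prune_mask(scores, keep_fraction):
    """Return a 0/1 mask that keeps the highest-scoring fraction of edges.

    scores is a list of rows, one score per edge. Ties at the cutoff
    score are all kept, so slightly more edges than requested may survive.
    """
    flat = sorted((s for row in scores for s in row), reverse=True)
    k = max(1, int(len(flat) * keep_fraction))
    threshold = flat[k - 1]
    return [[1 if s >= threshold else 0 for s in row] for row in scores]


def masked_linear(weights, mask, x):
    """Integer-only linear layer: frozen weights, with pruned edges contributing zero."""
    return [
        sum(w * m * xi for w, m, xi in zip(w_row, m_row, x))
        for w_row, m_row in zip(weights, mask)
    ]


if __name__ == "__main__":
    weights = [[2, -3], [4, 1]]   # frozen integer weights (hypothetical)
    scores = [[5, 1], [3, 4]]     # per-edge scores (hypothetical)
    mask = prune_mask(scores, 0.5)
    print(mask, masked_linear(weights, mask, [1, 2]))
```

Because only the mask changes during adaptation, the weight values and their scale factors can stay fixed, which is what makes static scale factors workable in this setting.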
Realized Volatility Forecasting for New Issues and Spin-Offs using Multi-Source Transfer Learning
Teller, Andreas, Pigorsch, Uta, Pigorsch, Christian
Forecasting the volatility of financial assets is essential for various financial applications. This paper addresses the challenging task of forecasting the volatility of financial assets with limited historical data, such as new issues or spin-offs, by proposing a multi-source transfer learning approach. Specifically, we exploit complementary source data of assets with a substantial historical data record by selecting source time series instances that are most similar to the limited target data of the new issue/spin-off. Based on these instances and the target data, we estimate linear and non-linear realized volatility models and compare their forecasting performance to forecasts of models trained exclusively on the target data, and models trained on the entire source and target data. The results show that our transfer learning approach outperforms the alternative models and that the integration of complementary data is also beneficial immediately after the initial trading day of the new issue/spin-off.
Transfer Learning for Automated Feedback Generation on Small Datasets
Feedback is a very important part of the learning process. However, it is challenging to make this feedback both timely and accurate when relying on human markers. This is the challenge that Automated Feedback Generation attempts to address. In this paper, a technique to train such a system on a very small dataset with very long sequences is presented. Both of these attributes make this a very challenging task. However, by using a three-stage transfer learning pipeline, state-of-the-art results can be achieved, with qualitatively accurate but unhuman-sounding results. The use of both Automated Essay Scoring and Automated Feedback Generation systems in the real world is also discussed.