AITopics | Transfer Learning

Collaborating Authors

Transfer Learning

Transfer Learning is the reuse of a pre-trained model on a new problem. (Towards Data Science)

News Overviews Instructional Materials AI-Alerts Classics

A Transfer Learning Framework for Multilayer Networks via Model Averaging

arXiv.org Machine LearningJun-17-2025

Link prediction in multilayer networks is a key challenge in applications such as recommendation systems and protein-protein interaction prediction. While many techniques have been developed, most rely on assumptions about shared structures and require access to raw auxiliary data, limiting their practicality. To address these issues, we propose a novel transfer learning framework for multilayer networks using a bi-level model averaging method. A $K$-fold cross-validation criterion based on edges is used to automatically weight inter-layer and intra-layer candidate models. This enables the transfer of information from auxiliary layers while mitigating model uncertainty, even without prior knowledge of shared structures. Theoretically, we prove the optimality and weight convergence of our method under mild conditions. Computationally, our framework is efficient and privacy-preserving, as it avoids raw data sharing and supports parallel processing across multiple servers. Simulations show our method outperforms others in predictive accuracy and robustness. We further demonstrate its practical value through two real-world recommendation system applications.

data mining, machine learning, prediction, (20 more...)

arXiv.org Machine Learning

2506.12455

Country:

North America > United States (0.14)
Asia > China > Anhui Province > Hefei (0.04)
Europe > Denmark (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Add feedback

Geometric Embedding Alignment via Curvature Matching in Transfer Learning

Ko, Sung Moon, Lee, Jaewan, Lee, Sumin, Yim, Soorin, Bae, Kyunghoon, Han, Sehui

arXiv.org Artificial IntelligenceJun-17-2025

Geometrical interpretations of deep learning models offer insightful perspectives into their underlying mathematical structures. In this work, we introduce a novel approach that leverages differential geometry, particularly concepts from Riemannian geometry, to integrate multiple models into a unified transfer learning framework. By aligning the Ricci curvature of latent space of individual models, we construct an interrelated architecture, namely Geometric Embedding Alignment via cuRvature matching in transfer learning (GEAR), which ensures comprehensive geometric representation across datapoints. This framework enables the effective aggregation of knowledge from diverse sources, thereby improving performance on target tasks. We evaluate our model on 23 molecular task pairs sourced from various domains and demonstrate significant performance gains over existing benchmark model under both random (14.4%) and scaffold (8.3%) data splits.

artificial intelligence, derivative, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2506.13015

Country: North America > United States (0.28)

Genre: Research Report > Promising Solution (0.48)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Neurosymbolic Artificial Intelligence for Robust Network Intrusion Detection: From Scratch to Transfer Learning

Tran, Huynh T. T., Sander, Jacob, Cohen, Achraf, Jalaian, Brian, Bastian, Nathaniel D.

arXiv.org Artificial IntelligenceJun-6-2025

Network Intrusion Detection Systems (NIDS) play a vital role in protecting digital infrastructures against increasingly sophisticated cyber threats. In this paper, we extend ODXU, a Neurosymbolic AI (NSAI) framework that integrates deep embedded clustering for feature extraction, symbolic reasoning using XGBoost, and comprehensive uncertainty quantification (UQ) to enhance robustness, interpretability, and generalization in NIDS. The extended ODXU incorporates score-based methods (e.g., Confidence Scoring, Shannon Entropy) and metamodel-based techniques, including SHAP values and Information Gain, to assess the reliability of predictions. Experimental results on the CIC-IDS-2017 dataset show that ODXU outperforms traditional neural models across six evaluation metrics, including classification accuracy and false omission rate. While transfer learning has seen widespread adoption in fields such as computer vision and natural language processing, its potential in cybersecurity has not been thoroughly explored. To bridge this gap, we develop a transfer learning strategy that enables the reuse of a pre-trained ODXU model on a different dataset. Our ablation study on ACI-IoT-2023 demonstrates that the optimal transfer configuration involves reusing the pre-trained autoencoder, retraining the clustering module, and fine-tuning the XGBoost classifier, and outperforms traditional neural models when trained with as few as 16,000 samples (approximately 50% of the training data). Additionally, results show that metamodel-based UQ methods consistently outperform score-based approaches on both datasets.

dataset, machine learning, neurosymbolic artificial intelligence, (11 more...)

arXiv.org Artificial Intelligence

2506.04454

Country: North America > United States > California (0.28)

Genre: Research Report > New Finding (0.88)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.86)

Add feedback

Attractor learning for spatiotemporally chaotic dynamical systems using echo state networks with transfer learning

Alam, Mohammad Shah, Ott, William, Timofeyev, Ilya

arXiv.org Machine LearningJun-2-2025

In this paper, we explore the predictive capabilities of echo state networks (ESNs) for the generalized Kuramoto-Sivashinsky (gKS) equation, an archetypal nonlinear PDE that exhibits spatiotemporal chaos. We introduce a novel methodology that integrates ESNs with transfer learning, aiming to enhance predictive performance across various parameter regimes of the gKS model. Our research focuses on predicting changes in long-term statistical patterns of the gKS model that result from varying the dispersion relation or the length of the spatial domain. We use transfer learning to adapt ESNs to different parameter settings and successfully capture changes in the underlying chaotic attractor.

artificial intelligence, machine learning, power spectrum, (16 more...)

arXiv.org Machine Learning

2505.24099

Country:

North America > United States > Texas > Harris County > Houston (0.14)
Europe > Germany > North Rhine-Westphalia > Cologne Region > Bonn (0.04)
Europe > France (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.87)

Add feedback

Evaluating Query Efficiency and Accuracy of Transfer Learning-based Model Extraction Attack in Federated Learning

Ahamed, Sayyed Farid, Roy, Sandip, Banerjee, Soumya, Vucovich, Marc, Choi, Kevin, Rahman, Abdul, Hu, Alison, Bowen, Edward, Shetty, Sachin

arXiv.org Artificial IntelligenceJun-2-2025

Federated Learning (FL) is a collaborative learning framework designed to protect client data, yet it remains highly vulnerable to Intellectual Property (IP) threats. Model extraction (ME) attacks pose a significant risk to Machine Learning as a Service (MLaaS) platforms, enabling attackers to replicate confidential models by querying black-box (without internal insight) APIs. Despite FL's privacy-preserving goals, its distributed nature makes it particularly susceptible to such attacks. This paper examines the vulnerability of FL-based victim models to two types of model extraction attacks. For various federated clients built under the NVFlare platform, we implemented ME attacks across two deep learning architectures and three image datasets. We evaluate the proposed ME attack performance using various metrics, including accuracy, fidelity, and KL divergence. The experiments show that for different FL clients, the accuracy and fidelity of the extracted model are closely related to the size of the attack query set. Additionally, we explore a transfer learning based approach where pretrained models serve as the starting point for the extraction process. The results indicate that the accuracy and fidelity of the fine-tuned pretrained extraction models are notably higher, particularly with smaller query sets, highlighting potential advantages for attackers.

accuracy, artificial intelligence, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2505.23791

Country: North America > United States (0.14)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

Reviews: Learning search spaces for Bayesian optimization: Another view of hyperparameter transfer learning

Neural Information Processing SystemsMay-31-2025, 21:42:11 GMT

My concern about generalization still remains, and I hope the authors can devote maybe a sentence or two to it in the final draft - even something to the effect of "it is a concern; experimental evidence suggests it is not a great concern."] Summary: For any given ML algorithm, e.g., random forests, the paper proposes a transfer-learning approach for selection of hyperparameters (limited to those parameters that can be ordered) wherein a bounding space is constructed from previous evaluations of that algorithm on other datasets. Two types of bounding spaces are described. The box space is the tightest bounding box covering the best known hyperparameter settings for previous datasets. The ellipsoid is found as the smallest-volume ellipsoid covering the best known settings (via convex optimization).

bayesian optimization, dataset, search space, (10 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.62)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.55)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.44)

Add feedback

Epistemic Errors of Imperfect Multitask Learners When Distributions Shift

Sloman, Sabina J., Caprio, Michele, Kaski, Samuel

arXiv.org Machine LearningMay-30-2025

When data are noisy, a statistical learner's goal is to resolve epistemic uncertainty about the data it will encounter at test-time, i.e., to identify the distribution of test (target) data. Many real-world learning settings introduce sources of epistemic uncertainty that can not be resolved on the basis of training (source) data alone: The source data may arise from multiple tasks (multitask learning), the target data may differ systematically from the source data tasks (distribution shift), and/or the learner may not arrive at an accurate characterization of the source data (imperfect learning). We introduce a principled definition of epistemic error, and provide a generic, decompositional epistemic error bound. Our error bound is the first to (i) consider epistemic error specifically, (ii) accommodate all the sources of epistemic uncertainty above, and (iii) separately attribute the error to each of multiple aspects of the learning procedure and environment. As corollaries of the generic result, we provide (i) epistemic error bounds specialized to the settings of Bayesian transfer learning and distribution shift within $ε$-neighborhoods, and (ii) a set of corresponding generalization bounds. Finally, we provide a novel definition of negative transfer, and validate its insights in a synthetic experimental setting.

artificial intelligence, learner, machine learning, (16 more...)

arXiv.org Machine Learning

2505.23496

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Finland > Uusimaa > Helsinki (0.04)
North America > United States > California (0.04)
Europe > United Kingdom > England > Greater Manchester > Manchester (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Measuring Fine-Grained Relatedness in Multitask Learning via Data Attribution

Tu, Yiwen, Liu, Ziqi, Ma, Jiaqi W., Tang, Weijing

arXiv.org Artificial IntelligenceMay-28-2025

Measuring task relatedness and mitigating negative transfer remain a critical open challenge in Multitask Learning (MTL). This work extends data attribution -- which quantifies the influence of individual training data points on model predictions -- to MTL setting for measuring task relatedness. We propose the MultiTask Influence Function (MTIF), a method that adapts influence functions to MTL models with hard or soft parameter sharing. Compared to conventional task relatedness measurements, MTIF provides a fine-grained, instance-level relatedness measure beyond the entire-task level. This fine-grained relatedness measure enables a data selection strategy to effectively mitigate negative transfer in MTL. Through extensive experiments, we demonstrate that the proposed MTIF efficiently and accurately approximates the performance of models trained on data subsets. Moreover, the data selection strategy enabled by MTIF consistently improves model performance in MTL. Our work establishes a novel connection between data attribution and MTL, offering an efficient and fine-grained solution for measuring task relatedness and enhancing MTL models.

artificial intelligence, dataset, machine learning, (12 more...)

arXiv.org Artificial Intelligence

2505.21438

Country:

North America > United States (1.00)
Europe (0.67)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Add feedback

Transfer Learning for Diffusion Models

Neural Information Processing SystemsMay-27-2025, 21:33:34 GMT

Diffusion models, a specific type of generative model, have achieved unprecedented performance in recent years and consistently produce high-quality synthetic samples. A critical prerequisite for their notable success lies in the presence of a substantial number of training samples, which can be impractical in real-world applications due to high collection costs or associated risks. Consequently, various finetuning and regularization approaches have been proposed to transfer knowledge from existing pre-trained models to specific target domains with limited data. This paper introduces the Transfer Guided Diffusion Process (TGDP), a novel approach distinct from conventional finetuning and regularization methods. We prove that the optimal diffusion model for the target domain integrates pre-trained diffusion models on the source domain with additional guidance from a domain classifier.

diffusion model, tgdp, transfer learning

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.40)

Add feedback

Pre-Train Your Loss: Easy Bayesian Transfer Learning with Informative Priors

Neural Information Processing SystemsMay-27-2025, 19:56:27 GMT

Deep learning is increasingly moving towards a transfer learning paradigm whereby large foundation models are fine-tuned on downstream tasks, starting from an initialization learned on the source task. But an initialization contains relatively little information about the source task, and does not reflect the belief that our knowledge of the source task should affect the locations and shape of optima on the downstream task.Instead, we show that we can learn highly informative posteriors from the source task, through supervised or self-supervised approaches, which then serve as the basis for priors that modify the whole loss surface on the downstream task. This simple modular approach enables significant performance gains and more data-efficient learning on a variety of downstream classification and segmentation tasks, serving as a drop-in replacement for standard pre-training strategies. These highly informative priors also can be saved for future use, similar to pre-trained weights, and stand in contrast to the zero-mean isotropic uninformative priors that are typically used in Bayesian deep learning.

artificial intelligence, easy bayesian transfer learning, machine learning, (4 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.54)

Add feedback