AITopics | Transfer Learning

Transfer Learning

Transfer Learning is the reuse of a pre-trained model on a new problem. (Towards Data Science)

Minimax optimal transfer learning for high-dimensional additive regression

arXiv.org Machine Learning, Sep-17-2025

Many human tasks benefit from prior experience when that experience is related to the task at hand. This phenomenon, whereby knowledge from previous tasks is transferred to new ones, has motivated the machine learning technique known as transfer learning. From a statistical perspective, consider the problem of analyzing a regression relationship when the available data are limited. Transfer learning (Torrey and Shavlik (2010)), one of the most widely used techniques in machine learning, can provide a solution. In this framework, one typically leverages related estimates obtained from large but non-identically distributed auxiliary samples, and then refines these estimates to obtain improved estimators from the smaller target sample. Transfer learning has been shown to be effective in a wide range of real-world applications, including computer vision (Kolesnikov et al. (2020); Bu et al. (2021)), natural language processing (Lee et al. (2020); Yuan et al. (2020)), and bioinformatics (Vorontsov et al. (2024); Gao and Cui (2020)), among others. Recently, the theoretical properties of transfer-learned estimators have been extensively investigated across a range of statistical problems.
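The two-step idea in the abstract above (borrow an estimate from a large auxiliary sample, then refine it on the small target sample) can be sketched for a one-dimensional linear model through the origin. This is only a toy illustration under assumptions not taken from the paper: the function names, the ridge-style shrinkage parameter `lam`, and the data are all hypothetical, and the paper's estimator for high-dimensional additive regression is not reproduced here.

```python
def fit_slope(xs, ys, lam=0.0):
    """Least-squares slope through the origin, with optional ridge shrinkage lam."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def transfer_slope(source, target, lam=1.0):
    """Two-step transfer sketch (hypothetical helper, not the paper's method).

    Step 1: estimate the slope on the large source (auxiliary) sample.
    Step 2: fit a shrunken correction to the target residuals, so the
            result stays near the source estimate unless the target
            data clearly disagree.
    """
    xs_s, ys_s = source
    xs_t, ys_t = target
    w_source = fit_slope(xs_s, ys_s)
    residuals = [y - w_source * x for x, y in zip(xs_t, ys_t)]
    delta = fit_slope(xs_t, residuals, lam=lam)
    return w_source + delta


# Hypothetical data: the source slope is 2.0, the target slope is 2.5.
source = ([float(i) for i in range(1, 101)], [2.0 * i for i in range(1, 101)])
target = ([1.0, 2.0, 3.0], [2.5, 5.0, 7.5])

w_src = fit_slope(*source)                      # source-only estimate
w_tl = transfer_slope(source, target, lam=1.0)  # source estimate plus shrunken correction
```

With a larger `lam` the correction shrinks toward zero and the result stays closer to the source estimate; with `lam=0` the target data alone determine the correction.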

2509.06308

Country:

North America > United States > New York (0.04)
Asia > South Korea > Seoul > Seoul (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(2 more...)

Genre:

Research Report > New Finding (0.45)
Research Report > Promising Solution (0.33)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.82)

Neurosymbolic AI Transfer Learning Improves Network Intrusion Detection

Tran, Huynh T. T., Sander, Jacob, Cohen, Achraf, Jalaian, Brian, Bastian, Nathaniel D.

arXiv.org Artificial Intelligence, Sep-16-2025

Transfer learning is commonly utilized in various fields such as computer vision, natural language processing, and medical imaging due to its impressive capability to address subtasks and work with different datasets. However, its application in cybersecurity has not been thoroughly explored. In this paper, we present an innovative neurosymbolic AI framework designed for network intrusion detection systems, which play a crucial role in combating malicious activities in cybersecurity. Our framework leverages transfer learning and uncertainty quantification. The findings indicate that transfer learning models, trained on large and well-structured datasets, outperform neural-based models that rely on smaller datasets, paving the way for a new era in cybersecurity solutions.

artificial intelligence, dataset, machine learning, (15 more...)

2509.1085

Country: North America > United States > California (0.28)

Genre: Research Report > New Finding (0.93)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Government > Military > Cyberwarfare (0.76)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
(3 more...)

Multi-Label Transfer Learning in Non-Stationary Data Streams

Du, Honghui, Minku, Leandro, Lawlor, Aonghus, Zhou, Huiyu

arXiv.org Artificial Intelligence, Sep-11-2025

Abstract: Label concepts in multi-label data streams often experience drift in non-stationary environments, either independently or in relation to other labels. Transferring knowledge between related labels can accelerate adaptation, yet research on multi-label transfer learning for data streams remains limited. To address this, we propose two novel transfer learning methods: BR-MARLENE leverages knowledge from different labels in both source and target streams for multi-label classification; BRPW-MARLENE builds on this by explicitly modelling and transferring pairwise label dependencies to enhance learning performance. Comprehensive experiments show that both methods outperform state-of-the-art multi-label stream approaches in non-stationary environments, demonstrating the effectiveness of inter-label knowledge transfer for improved predictive performance. Index Terms: Concept drift, non-stationary environment, multi-source, multi-label, class imbalance, transfer learning. Most research on data stream learning concentrates on streams with single labels [1]. However, many practical data streaming applications naturally adopt a multi-label paradigm, where each incoming data example has more than one label [2]. For example, a social media post could be tagged with several descriptors, or a movie might be classified under various predefined genres (e.g., Action, Crime, Historical), with each tag or genre representing a unique label.

artificial intelligence, data stream, machine learning, (16 more...)

2509.08181

Country: Europe > United Kingdom (0.46)

Genre: Research Report (1.00)

Industry: Education > Educational Setting > Online (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)

MARLINE: Multi-Source Mapping Transfer Learning for Non-Stationary Environments

Du, Honghui, Minku, Leandro, Zhou, Huiyu

arXiv.org Artificial Intelligence, Sep-11-2025

Concept drift is a major problem in online learning due to its impact on the predictive performance of data stream mining systems. Recent studies have started exploring data streams from different sources as a strategy to tackle concept drift in a given target domain. These approaches make the assumption that at least one of the source models represents a concept similar to the target concept, which may not hold in many real-world scenarios. In this paper, we propose a novel approach called Multi-source mApping with tRansfer LearnIng for Non-stationary Environments (MARLINE). MARLINE can benefit from knowledge from multiple data sources in non-stationary environments even when source and target concepts do not match. This is achieved by projecting the target concept to the space of each source concept, enabling multiple source sub-classifiers to contribute towards the prediction of the target concept as part of an ensemble. Experiments on several synthetic and real-world datasets show that MARLINE was more accurate than several state-of-the-art data stream learning approaches.

artificial intelligence, machine learning, marline, (17 more...)

doi: 10.1109/ICDM50108.2020.00021

2509.08176

Country: Europe > United Kingdom > England (0.14)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry: Education > Educational Setting (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.72)

Enhancing Privacy Preservation and Reducing Analysis Time with Federated Transfer Learning in Digital Twins-based Computed Tomography Scan Analysis

Jan, Avais, Zia, Qasim, Patterson, Murray

arXiv.org Artificial Intelligence, Sep-11-2025

The application of Digital Twin (DT) technology and Federated Learning (FL) has great potential to change the field of biomedical image analysis, particularly for Computed Tomography (CT) scans. This paper presents Federated Transfer Learning (FTL) as a new Digital Twin-based CT scan analysis paradigm. FTL uses pre-trained models and knowledge transfer between peer nodes to solve problems such as data privacy, limited computing resources, and data heterogeneity. The proposed framework allows real-time collaboration between cloud servers and Digital Twin-enabled CT scanners while protecting patient identity. We apply the FTL method to a heterogeneous CT scan dataset and assess model performance using convergence time, model accuracy, precision, recall, F1 score, and confusion matrix. It has been shown to perform better than conventional FL and Clustered Federated Learning (CFL) methods with better precision, accuracy, recall, and F1-score. The technique is beneficial in settings where the data is not independently and identically distributed (non-IID), and it offers reliable, efficient, and secure solutions for medical diagnosis. These findings highlight the possibility of using FTL to improve decision-making in digital twin-based CT scan analysis, secure and efficient medical image analysis, promote privacy, and open new possibilities for applying precision medicine and smart healthcare systems.
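The evaluation metrics named in the abstract above have standard definitions in terms of confusion-matrix counts. A minimal sketch for the binary case follows; the function name and the example counts are hypothetical and are not taken from the paper.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from binary confusion-matrix counts.

    tp/fp/fn/tn are the numbers of true positives, false positives,
    false negatives and true negatives. Undefined ratios fall back to 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
```

For a multi-class confusion matrix, the same formulas are usually applied per class (one class versus the rest) and then averaged.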

artificial intelligence, learning, machine learning, (16 more...)

2509.08018

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.66)

Robust and Adaptive Spectral Method for Representation Multi-Task Learning with Contamination

Huang, Yian, Feng, Yang, Ying, Zhiliang

arXiv.org Machine Learning, Sep-9-2025

Representation-based multi-task learning (MTL) improves efficiency by learning a shared structure across tasks, but its practical application is often hindered by contamination, outliers, or adversarial tasks. Most existing methods and theories assume a clean or near-clean setting, failing when contamination is significant. This paper tackles representation MTL with an unknown and potentially large contamination proportion, while also allowing for heterogeneity among inlier tasks. We introduce a Robust and Adaptive Spectral method (RAS) that can distill the shared inlier representation effectively and efficiently, while requiring no prior knowledge of the contamination level or the true representation dimension. Theoretically, we provide non-asymptotic error bounds for both the learned representation and the per-task parameters. These bounds adapt to inlier task similarity and outlier structure, and guarantee that RAS performs at least as well as single-task learning, thus preventing negative transfer. We also extend our framework to transfer learning with corresponding theoretical guarantees for the target task. Extensive experiments confirm our theory, showcasing the robustness and adaptivity of RAS, and its superior performance in regimes with up to 80% task contamination.

contamination proportion, estimation error, representation, (13 more...)

2509.06575

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Washington > King County > Bellevue (0.04)
North America > United States > New York (0.04)
(6 more...)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

An Analysis of Layer-Freezing Strategies for Enhanced Transfer Learning in YOLO Architectures

Dobrzycki, Andrzej D., Bernardos, Ana M., Casar, José R.

arXiv.org Artificial Intelligence, Sep-9-2025

The You Only Look Once (YOLO) architecture is crucial for real-time object detection. However, deploying it in resource-constrained environments such as unmanned aerial vehicles (UAVs) requires efficient transfer learning. Although layer freezing is a common technique, the specific impact of various freezing configurations on contemporary YOLOv8 and YOLOv10 architectures remains unexplored, particularly with regard to the interplay between freezing depth, dataset characteristics, and training dynamics. This research addresses this gap by presenting a detailed analysis of layer-freezing strategies. We systematically investigate multiple freezing configurations across YOLOv8 and YOLOv10 variants using four challenging datasets that represent critical infrastructure monitoring. Our methodology integrates a gradient behavior analysis (L2 norm) and visual explanations (Grad-CAM) to provide deeper insights into training dynamics under different freezing strategies. Our results reveal that there is no universal optimal freezing strategy but, rather, one that depends on the properties of the data. For example, freezing the backbone is effective for preserving general-purpose features, while a shallower freeze is better suited to handling extreme class imbalance. These configurations reduce graphics processing unit (GPU) memory consumption by up to 28% compared to full fine-tuning and, in some cases, achieve mean average precision (mAP@50) scores that surpass those of full fine-tuning. Gradient analysis corroborates these findings, showing distinct convergence patterns for moderately frozen models. Ultimately, this work provides empirical findings and practical guidelines for selecting freezing strategies. It offers a practical, evidence-based approach to balanced transfer learning for object detection in scenarios with limited resources.

artificial intelligence, deep learning, machine learning, (19 more...)

doi: 10.3390/math13152539

2509.0549

Country:

Europe (1.00)
Asia (0.92)
North America > United States (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology (0.66)
Energy > Power Industry (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.48)

Transfer Learning for Minimum Operating Voltage Prediction in Advanced Technology Nodes: Leveraging Legacy Data and Silicon Odometer Sensing

Yin, Yuxuan, Chen, Rebecca, Xu, Boxun, He, Chen, Li, Peng

arXiv.org Artificial Intelligence, Sep-3-2025

Accurate prediction of chip performance is critical for ensuring energy efficiency and reliability in semiconductor manufacturing. However, developing minimum operating voltage ($V_{min}$) prediction models at advanced technology nodes is challenging due to limited training data and the complex relationship between process variations and $V_{min}$. To address these issues, we propose a novel transfer learning framework that leverages abundant legacy data from the 16nm technology node to enable accurate $V_{min}$ prediction at the advanced 5nm node. A key innovation of our approach is the integration of input features derived from on-chip silicon odometer sensor data, which provide fine-grained characterization of localized process variations -- an essential factor at the 5nm node -- resulting in significantly improved prediction accuracy.

artificial intelligence, machine learning, technology node, (14 more...)

2509.00035

Country: North America > United States > California (0.28)

Genre: Research Report (0.64)

Industry:

Semiconductors & Electronics (1.00)
Information Technology > Hardware (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

C-Flat++: Towards a More Efficient and Powerful Framework for Continual Learning

Li, Wei, Yuan, Hangjie, Zhao, Zixiang, Zhu, Yifan, Lu, Aojun, Feng, Tao, Sun, Yanan

arXiv.org Artificial Intelligence, Sep-1-2025

Balancing sensitivity to new tasks and stability for retaining past knowledge is crucial in continual learning (CL). Recently, sharpness-aware minimization has proven effective in transfer learning and has also been adopted in continual learning (CL) to improve memory retention and learning efficiency. However, relying on zeroth-order sharpness alone may favor sharper minima over flatter ones in certain settings, leading to less robust and potentially suboptimal solutions. In this paper, we propose Continual Flatness (C-Flat), a method that promotes flatter loss landscapes tailored for CL. C-Flat offers plug-and-play compatibility, enabling easy integration with minimal modifications to the code pipeline. Besides, we present a general framework that integrates C-Flat into all major CL paradigms and conduct comprehensive comparisons with loss-minima optimizers and flat-minima-based CL methods. Our results show that C-Flat consistently improves performance across a wide range of settings. In addition, we introduce C-Flat++, an efficient yet effective framework that leverages selective flatness-driven promotion, significantly reducing the update cost required by C-Flat. Extensive experiments across multiple CL methods, datasets, and scenarios demonstrate the effectiveness and efficiency of our proposed approaches. Code is available at https://github.com/WanNaa/C-Flat.

artificial intelligence, machine learning, optimization problem, (16 more...)

2508.1886

Country: Asia > China (0.46)

Genre: Research Report > New Finding (0.86)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.34)

Sensitivity of Stability: Theoretical & Empirical Analysis of Replicability for Adaptive Data Selection in Transfer Learning

Singh, Prabhav, Sorrell, Jessica

arXiv.org Artificial Intelligence, Sep-1-2025

The widespread adoption of transfer learning has revolutionized machine learning by enabling efficient adaptation of pre-trained models to new domains. However, the reliability of these adaptations remains poorly understood, particularly when using adaptive data selection strategies that dynamically prioritize training examples. We present a comprehensive theoretical and empirical analysis of replicability in transfer learning, introducing a mathematical framework that quantifies the fundamental trade-off between adaptation effectiveness and result consistency. Our key contribution is the formalization of selection sensitivity ($Δ_Q$), a measure that captures how adaptive selection strategies respond to perturbations in training data. We prove that the replicability failure probability, i.e., the likelihood that two independent training runs produce models differing in performance by more than a threshold, increases quadratically with selection sensitivity while decreasing exponentially with sample size. Through extensive experiments on the MultiNLI corpus using six adaptive selection strategies, ranging from uniform sampling to gradient-based selection, we demonstrate that this theoretical relationship holds precisely in practice. Our results reveal that highly adaptive strategies like gradient-based and curriculum learning achieve superior task performance but suffer from high replicability failure rates, while less adaptive approaches maintain failure rates below 7%. Crucially, we show that source domain pretraining provides a powerful mitigation mechanism, reducing failure rates by up to 30% while preserving performance gains. These findings establish principled guidelines for practitioners to navigate the performance-replicability trade-off and highlight the need for replicability-aware design in modern transfer learning systems.
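The replicability failure probability defined in the abstract above (two independent training runs differing in performance by more than a threshold) can be estimated empirically from paired runs. The sketch below is only an illustration of that definition; the function name, the scores, and the threshold are hypothetical and do not come from the paper.

```python
def replicability_failure_rate(run_pairs, threshold):
    """Fraction of paired runs whose scores differ by more than `threshold`.

    run_pairs: iterable of (score_a, score_b), one pair per two
               independent training runs under the same configuration.
    """
    pairs = list(run_pairs)
    if not pairs:
        raise ValueError("need at least one pair of runs")
    failures = sum(1 for a, b in pairs if abs(a - b) > threshold)
    return failures / len(pairs)


# Hypothetical accuracy scores from four pairs of independent runs.
pairs = [(0.81, 0.80), (0.79, 0.84), (0.82, 0.82), (0.77, 0.83)]
rate = replicability_failure_rate(pairs, threshold=0.03)
```

Here two of the four pairs differ by more than 0.03, so the estimated failure rate is 0.5. In practice many more pairs would be needed for a stable estimate.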

artificial intelligence, machine learning, replicability, (14 more...)

2508.04901

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
