AITopics | Transfer Learning

Transfer Learning

Transfer Learning is the reuse of a pre-trained model on a new problem. (Towards Data Science)

DiBS-MTL: Transformation-Invariant Multitask Learning with Direction Oracles

Murthy, Surya, Gupta, Kushagra, Karabag, Mustafa O., Fridovich-Keil, David, Topcu, Ufuk

arXiv.org Artificial Intelligence, Sep-30-2025

Multitask learning (MTL) algorithms typically rely on schemes that combine different task losses or their gradients through weighted averaging. These methods aim to find Pareto stationary points by using heuristics that require access to task loss values, gradients, or both. In doing so, a central challenge arises because task losses can be arbitrarily, nonaffinely scaled relative to one another, causing certain tasks to dominate training and degrade overall performance. A recent advance in cooperative bargaining theory, the Direction-based Bargaining Solution (DiBS), yields Pareto stationary solutions immune to task domination because of its invariance to monotonic nonaffine task loss transformations. However, the convergence behavior of DiBS in nonconvex MTL settings is currently not understood. To this end, we prove that under standard assumptions, a subsequence of DiBS iterates converges to a Pareto stationary point when task losses are possibly nonconvex, and propose DiBS-MTL, a computationally efficient adaptation of DiBS to the MTL setting. Finally, we validate DiBS-MTL empirically on standard MTL benchmarks, showing that it achieves competitive performance with state-of-the-art methods while maintaining robustness to nonaffine monotonic transformations that significantly degrade the performance of existing approaches, including prior bargaining-inspired MTL methods. Code available at https://github.com/suryakmurthy/dibs-mtl.
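
The task-domination problem described in this abstract can be illustrated with a small numerical sketch. This is not the DiBS or DiBS-MTL update; it only shows, under assumed toy gradients, that a monotonic nonaffine transformation of one task loss rescales that task's gradient (by the chain rule) and so dominates a weighted average, while the gradient's direction is unchanged. The function names and the transformation h(L) = exp(5L) are hypothetical choices made for illustration.

```python
import numpy as np

def weighted_average_gradient(grads, weights):
    """Combine per-task gradients by weighted averaging."""
    grads = np.asarray(grads, dtype=float)      # shape (n_tasks, n_params)
    weights = np.asarray(weights, dtype=float)  # shape (n_tasks,)
    return weights @ grads / weights.sum()

def unit_direction(grad):
    """Return the gradient rescaled to unit Euclidean norm."""
    g = np.asarray(grad, dtype=float)
    return g / np.linalg.norm(g)

# Two hypothetical task gradients evaluated at the same parameters.
g1 = np.array([1.0, 0.0])
g2 = np.array([0.0, 1.0])

# Replace task 2's loss L by h(L) = exp(5 L), evaluated at L = 1.
# By the chain rule the gradient is multiplied by h'(1) = 5 * exp(5).
scale = 5.0 * np.exp(5.0)
g2_transformed = scale * g2

avg_before = weighted_average_gradient([g1, g2], [1.0, 1.0])
avg_after = weighted_average_gradient([g1, g2_transformed], [1.0, 1.0])

print("average before:", avg_before)
print("average direction after:", unit_direction(avg_after))
print("task 2 direction unchanged:",
      np.allclose(unit_direction(g2), unit_direction(g2_transformed)))
```

After the transformation, the averaged update points almost entirely along task 2, whereas the per-task direction is identical before and after. That invariance of directions is the property the abstract attributes to direction-based methods.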

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2509.23948

Country: North America > United States > Texas (0.28)

Genre: Research Report > Promising Solution (0.34)

Industry: Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
(3 more...)

Transfer Learning and Machine Learning for Training Five Year Survival Prognostic Models in Early Breast Cancer

Pilgram, Lisa, Yang, Kai, Beltran-Bless, Ana-Alicia, Pond, Gregory R., Vandermeer, Lisa, Hilton, John, Savard, Marie-France, Leblanc, Andréanne, Sheperd, Lois, Chen, Bingshu E., Bartlett, John M. S., Taylor, Karen J., Bayani, Jane, Barker, Sarah L., Spears, Melanie, van der Velde, Cornelis J. H., Kranenbarg, Elma Meershoek-Klein, Dirix, Luc, Mallon, Elizabeth, Hasenburg, Annette, Markopoulos, Christos, Juwara, Lamin, Dankar, Fida K., Clemons, Mark, Emam, Khaled El

arXiv.org Artificial Intelligence, Sep-30-2025

Prognostic information is essential for decision-making in breast cancer management. Recently, trials have predominantly focused on genomic prognostication tools, even though clinicopathological prognostication is less costly and more widely accessible. Machine learning (ML), transfer learning and ensemble integration offer opportunities to build robust prognostication frameworks. We evaluate this potential to improve survival prognostication in breast cancer by comparing de-novo ML, transfer learning from a pre-trained prognostic tool and ensemble integration. Data from the MA.27 trial was used for model training, with external validation on the TEAM trial and a SEER cohort. Transfer learning was applied by fine-tuning the pre-trained prognostic tool PREDICT v3, de-novo ML included Random Survival Forests (RSF) and Extreme Gradient Boosting, and ensemble integration was realized through a weighted sum of model predictions. Transfer learning, de-novo RSF, and ensemble integration improved calibration in MA.27 over the pre-trained model (ICI reduced from 0.042 in PREDICT v3 to <=0.007) while discrimination remained comparable (AUC increased from 0.738 in PREDICT v3 to 0.744-0.799). Invalid PREDICT v3 predictions were observed in 23.8-25.8% of MA.27 individuals due to missing information. In contrast, ML models and ensemble integration could predict survival regardless of missing information. Across all models, patient age, nodal status, pathological grading and tumor size had the highest SHAP values, indicating their importance for survival prognostication. External validation in SEER, but not in TEAM, confirmed the benefits of transfer learning, RSF and ensemble integration. This study demonstrates that transfer learning, de-novo RSF, and ensemble integration can improve prognostication in situations where relevant information for PREDICT v3 is lacking or where a dataset shift is likely.
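
The abstract describes ensemble integration as a weighted sum of model predictions. A minimal sketch of that idea follows. The function name, the example probabilities, and the weights are hypothetical, and normalizing the weights to sum to one is an assumption that the abstract does not state.

```python
import numpy as np

def weighted_ensemble(predictions, weights):
    """Weighted sum of per-model predictions.

    predictions: array of shape (n_models, n_patients), e.g. predicted
                 five-year survival probabilities from each model.
    weights:     array of shape (n_models,); normalized here to sum to 1
                 (an assumption made for this sketch).
    """
    preds = np.asarray(predictions, dtype=float)
    w = np.asarray(weights, dtype=float)
    if preds.shape[0] != w.shape[0]:
        raise ValueError("need exactly one weight per model")
    w = w / w.sum()
    return w @ preds

# Hypothetical survival probabilities from three models for two patients.
model_predictions = [
    [0.90, 0.60],  # e.g. a fine-tuned prognostic tool
    [0.80, 0.70],  # e.g. a random survival forest
    [0.85, 0.50],  # e.g. a gradient-boosted model
]
model_weights = [0.5, 0.3, 0.2]

ensemble_prediction = weighted_ensemble(model_predictions, model_weights)
print(ensemble_prediction)
```

In practice the weights would be chosen on held-out data; the values above are placeholders only.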

artificial intelligence, information, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2509.23268

Country:

North America > United States (1.00)
North America > Canada > Ontario (1.00)
Europe > United Kingdom (0.67)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Research Report > Strength Medium (0.68)

Industry:

Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (0.85)
Government > Regional Government > North America Government > United States Government > FDA (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Multimodal Slice Interaction Network Enhanced by Transfer Learning for Precise Segmentation of Internal Gross Tumor Volume in Lung Cancer PET/CT Imaging

Luo, Yi, Guo, Yike, Hooshangnejad, Hamed, Zhang, Rui, Feng, Xue, Chen, Quan, Ngwa, Wil, Ding, Kai

arXiv.org Artificial Intelligence, Sep-30-2025

Lung cancer remains the leading cause of cancer-related deaths globally. Accurate delineation of internal gross tumor volume (IGTV) in PET/CT imaging is pivotal for optimal radiation therapy in mobile tumors such as lung cancer to account for tumor motion, yet is hindered by the limited availability of annotated IGTV datasets and attenuated PET signal intensity at tumor boundaries. In this study, we present a transfer learning-based methodology utilizing a multimodal interactive perception network with MAMBA, pre-trained on extensive gross tumor volume (GTV) datasets and subsequently fine-tuned on a private IGTV cohort. This cohort constitutes the PET/CT subset of the Lung-cancer Unified Cross-modal Imaging Dataset (LUCID). To further address the challenge of weak PET intensities in IGTV peripheral slices, we introduce a slice interaction module (SIM) within a 2.5D segmentation framework to effectively model inter-slice relationships. Our proposed module integrates channel and spatial attention branches with depthwise convolutions, enabling more robust learning of slice-to-slice dependencies and thereby improving overall segmentation performance. A comprehensive experimental evaluation demonstrates that our approach achieves a Dice of 0.609 on the private IGTV dataset, substantially surpassing the conventional baseline score of 0.385. This work highlights the potential of transfer learning, coupled with advanced multimodal techniques and a SIM to enhance the reliability and clinical relevance of IGTV segmentation for lung cancer radiation therapy planning.
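
The Dice score reported in this abstract measures overlap between a predicted and a reference segmentation mask. The sketch below uses the standard definition, 2|A ∩ B| / (|A| + |B|), on a toy pair of binary masks. The function name, the masks, and the convention of returning 1.0 when both masks are empty are assumptions for illustration, and the paper's exact evaluation protocol may differ.

```python
import numpy as np

def dice_coefficient(pred_mask, true_mask):
    """Dice overlap between two binary masks of the same shape."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    if pred.shape != true.shape:
        raise ValueError("masks must have the same shape")
    intersection = np.logical_and(pred, true).sum()
    total = pred.sum() + true.sum()
    if total == 0:
        # Both masks empty: treated as perfect agreement (a convention).
        return 1.0
    return 2.0 * intersection / total

# Toy 2x3 masks: 3 predicted voxels, 3 reference voxels, 2 shared.
predicted = [[1, 1, 0],
             [0, 1, 0]]
reference = [[1, 0, 0],
             [0, 1, 1]]

dice = dice_coefficient(predicted, reference)
print(round(dice, 4))  # 2 * 2 / (3 + 3)
```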

artificial intelligence, machine learning, segmentation, (15 more...)

arXiv.org Artificial Intelligence

2509.22841

Country: North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine > Therapeutic Area > Oncology > Lung Cancer (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.63)

Bayesian multi-domain learning for cancer subtype discovery from next-generation sequencing count data

Neural Information Processing Systems, Sep-29-2025, 21:58:37 GMT

Precision medicine aims for personalized prognosis and therapeutics by utilizing recent genome-scale high-throughput profiling techniques, including next-generation sequencing (NGS). However, translating NGS data faces several challenges. First, NGS count data are often overdispersed, requiring appropriate modeling. Second, compared to the number of involved molecules and system complexity, the number of available samples for studying complex diseases, such as cancer, is often limited, especially considering disease heterogeneity. The key question is whether we may integrate available data from all different sources or domains to achieve reproducible disease prognosis based on NGS count data. In this paper, we develop a Bayesian Multi-Domain Learning (BMDL) model that derives domain-dependent latent representations of overdispersed count data based on hierarchical negative binomial factorization for accurate cancer subtyping even if the number of samples for a specific cancer type is small. Experimental results from both our simulated and NGS datasets from The Cancer Genome Atlas (TCGA) demonstrate the promising potential of BMDL for effective multi-domain learning without "negative transfer" effects often seen in existing multi-task learning and transfer learning methods.
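
Overdispersion, mentioned in this abstract, means that the variance of the counts exceeds their mean, which a Poisson model (variance equal to mean) cannot capture. The sketch below is not the BMDL model; it only illustrates the standard negative binomial variance formula, mean + mean² / r, with a hypothetical mean and dispersion r, and checks it against simulated counts.

```python
import numpy as np

def negative_binomial_variance(mean, dispersion):
    """Variance of a negative binomial with the given mean and dispersion r."""
    return mean + mean ** 2 / dispersion

mean_count = 100.0  # hypothetical expected read count
r = 5.0             # hypothetical dispersion; smaller r means more overdispersion

poisson_variance = mean_count  # a Poisson model forces variance == mean
nb_variance = negative_binomial_variance(mean_count, r)

# Simulate counts with NumPy's (n, p) parameterization, where
# mean = n * (1 - p) / p, so p = r / (r + mean) gives the desired mean.
rng = np.random.default_rng(0)
p = r / (r + mean_count)
samples = rng.negative_binomial(r, p, size=100_000)

print("Poisson variance:", poisson_variance)
print("negative binomial variance:", nb_variance)
print("sample mean and variance:", samples.mean(), samples.var())
```

With these placeholder values the negative binomial variance is 21 times the Poisson variance at the same mean, which is the kind of gap that motivates negative binomial modeling of count data.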

bayesian multi-domain learning, cancer subtype discovery, count data, (4 more...)

Neural Information Processing Systems

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.60)

Transfer learning for multifidelity simulation-based inference in cosmology

Saoulis, Alex A., Piras, Davide, Jeffrey, Niall, Mancini, Alessio Spurio, Ferreira, Ana M. G., Joachimi, Benjamin

arXiv.org Artificial Intelligence, Sep-29-2025

Simulation-based inference (SBI) enables cosmological parameter estimation when closed-form likelihoods or models are unavailable. However, SBI relies on machine learning for neural compression and density estimation. This requires large training datasets which are prohibitively expensive for high-quality simulations. We overcome this limitation with multifidelity transfer learning, combining less expensive, lower-fidelity simulations with a limited number of high-fidelity simulations. We demonstrate our methodology on dark matter density maps from two separate simulation suites in the hydrodynamical CAMELS Multifield Dataset. Pre-training on dark-matter-only N-body simulations reduces the required number of high-fidelity hydrodynamical simulations by a factor between 8 and 15, depending on the model complexity, posterior dimensionality, and performance metrics used. By leveraging cheaper simulations, our approach enables performant and accurate inference on high-fidelity models while substantially reducing computational costs.

artificial intelligence, machine learning, simulation, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1093/mnras/staf1436

2505.21215

Country:

Europe (1.00)
North America > United States (0.94)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.66)

Benchmarking for Practice: Few-Shot Time-Series Crop-Type Classification on the EuroCropsML Dataset

Reuss, Joana, Macdonald, Jan, Becker, Simon, Gikalo, Ekaterina, Schultka, Konrad, Richter, Lorenz, Körner, Marco

arXiv.org Artificial Intelligence, Sep-26-2025

Accurate crop-type classification from satellite time series is essential for agricultural monitoring. While various machine learning algorithms have been developed to enhance performance on data-scarce tasks, their evaluation often lacks real-world scenarios. Consequently, their efficacy in challenging practical applications has not yet been thoroughly assessed. To facilitate future research in this domain, we present the first comprehensive benchmark for evaluating supervised and SSL methods for crop-type classification under real-world conditions. This benchmark study relies on the EuroCropsML time-series dataset, which combines farmer-reported crop data with Sentinel-2 satellite observations from Estonia, Latvia, and Portugal. Our findings indicate that MAML-based meta-learning algorithms achieve slightly higher accuracy compared to supervised transfer learning and SSL methods. However, compared to simpler transfer learning, the improvement of meta-learning comes at the cost of increased computational demands and training time. Moreover, supervised methods benefit most when pre-trained and fine-tuned on geographically close regions. In addition, while SSL generally lags behind meta-learning, it demonstrates advantages over training from scratch, particularly in capturing fine-grained features essential for real-world crop-type classification, and also surpasses standard transfer learning. This highlights its practical value when labeled pre-training crop data is scarce. Our insights underscore the trade-offs between accuracy and computational demand in selecting supervised machine learning methods for real-world crop-type classification tasks and highlight the difficulties of knowledge transfer across diverse geographic regions. Furthermore, they demonstrate the practical value of SSL approaches when labeled pre-training crop data is scarce.

machine learning, natural language, text classification, (19 more...)

arXiv.org Artificial Intelligence

2504.11022

Country:

North America > United States (1.00)
Europe > Germany (0.67)

Genre: Research Report > New Finding (0.65)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Food & Agriculture > Agriculture (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.76)

SMILES-Inspired Transfer Learning for Quantum Operators in Generative Quantum Eigensolver

Yin, Zhi, Li, Xiaoran, Zhang, Shengyu, Li, Xin, Zhang, Xiaojin

arXiv.org Artificial Intelligence, Sep-25-2025

Given the inherent limitations of traditional Variational Quantum Eigensolver (VQE) algorithms, the integration of deep generative models into hybrid quantum-classical frameworks, specifically the Generative Quantum Eigensolver (GQE), represents a promising innovative approach. However, taking the Unitary Coupled Cluster with Singles and Doubles (UCCSD) ansatz, which is widely used in quantum chemistry, as an example, different molecular systems require constructions of distinct quantum operators. Considering the similarity of different molecules, the construction of quantum operators utilizing the similarity can reduce the computational cost significantly. Inspired by the SMILES representation method in computational chemistry, we developed a text-based representation approach for UCCSD quantum operators by leveraging the inherent representational similarities between different molecular systems. This framework explores text pattern similarities in quantum operators and employs text similarity metrics to establish a transfer learning framework. Our approach with a naive baseline setting demonstrates knowledge transfer between different molecular systems for ground-state energy calculations within the GQE paradigm. This discovery offers significant benefits for hybrid quantum-classical computation of molecular ground-state energies, substantially reducing computational resource requirements.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2509.19715

Country: Asia > China (0.47)

Genre:

Research Report > Promising Solution (0.34)
Overview > Innovation (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.64)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

Statistical Insight into Meta-Learning via Predictor Subspace Characterization and Quantification of Task Diversity

Datta, Saptati, Hengartner, Nicolas W., Pimonova, Yulia, Klein, Natalie E., Lubbers, Nicholas

arXiv.org Machine Learning, Sep-24-2025

In recent years, there has been significant interest in designing machine learning algorithms that enable robust and sample-efficient knowledge transfer across tasks to facilitate rapid and accurate estimation and prediction. Traditional machine learning methods have largely followed a single-task or "isolated learning" framework, where each task is learned independently, ignoring knowledge from prior tasks (Upadhyay et al., 2024). However, unlike such isolated approaches, human learning relies on prior experiences to accelerate new learning. Inspired by this, recent prominent "knowledge-transfer" approaches include meta-learning (Finn et al., 2017; Bouchattaoui, 2024), transfer learning (Zhu et al., 2023; Zhuang et al., 2020), multi-task learning (Crawshaw, 2020; Zhang and Yang, 2022), and lifelong learning (Liu, 2017), all of which aim to leverage shared structure across tasks to improve generalization and aim to replicate this human-like knowledge transfer. Meta-learning focuses on learning a learning algorithm that can quickly adapt to new tasks using limited data. Transfer learning reuses knowledge from related source tasks to improve performance on a target task with few labeled examples.

posterior distribution, prediction, subspace, (15 more...)

arXiv.org Machine Learning

2509.18349

Country:

North America > United States > Texas (0.04)
North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(2 more...)

Genre: Research Report (0.85)

Industry: Education > Educational Setting > Continuing Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Transfer learning under latent space model

Fang, Kuangnan, Qin, Ruixuan, Fan, Xinyan

arXiv.org Machine Learning, Sep-22-2025

The latent space model plays a crucial role in network analysis, and accurate estimation of latent variables is essential for downstream tasks such as link prediction. However, the large number of parameters to be estimated presents a challenge, especially when the latent space dimension is not exceptionally small. In this paper, we propose a transfer learning method that leverages information from networks with latent variables similar to those in the target network, thereby improving the estimation accuracy for the target. Given transferable source networks, we introduce a two-stage transfer learning algorithm that accommodates differences in node numbers between source and target networks. In each stage, we derive sufficient identification conditions and design tailored projected gradient descent algorithms for estimation. Theoretical properties of the resulting estimators are established. When the transferable networks are unknown, a detection algorithm is introduced to identify suitable source networks. Simulation studies and analyses of two real datasets demonstrate the effectiveness of the proposed methods.

latent variable, source network, target network, (14 more...)

arXiv.org Machine Learning

2509.15797

Country:

Asia > China > Fujian Province > Xiamen (0.04)
Europe > Middle East > Cyprus > Nicosia > Nicosia (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

SETrLUSI: Stochastic Ensemble Multi-Source Transfer Learning Using Statistical Invariant

Li, Chunna, Song, Yiwei, Shao, Yuanhai

arXiv.org Machine Learning, Sep-22-2025

In transfer learning, a source domain often carries diverse knowledge, and different domains usually emphasize different types of knowledge. Unlike traditional transfer learning methods, which handle only a single type of knowledge from all domains, we introduce an ensemble learning framework with a weak mode of convergence in the form of Statistical Invariant (SI) for multi-source transfer learning, formulated as Stochastic Ensemble Multi-Source Transfer Learning Using Statistical Invariant (SETrLUSI). The proposed SI extracts and integrates various types of knowledge from both source and target domains, which not only effectively utilizes diverse knowledge but also accelerates the convergence process. Further, SETrLUSI incorporates stochastic SI selection, proportional source domain sampling, and target domain bootstrapping, which improves training efficiency while enhancing model stability. Experiments show that SETrLUSI has good convergence and outperforms related methods with a lower time cost.

knowledge, source domain, target domain, (10 more...)

arXiv.org Machine Learning

2509.15593

Country:

North America > United States > California (0.04)
Europe > Spain > Galicia > Madrid (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)
