AITopics | Transfer Learning

Collaborating Authors

Transfer Learning

Transfer Learning is the reuse of a pre-trained model on a new problem. (Towards Data Science)

News Overviews Instructional Materials AI-Alerts Classics

Universality in Transfer Learning for Linear Models

Ghane, Reza, Akhtiamov, Danil, Hassibi, Babak

arXiv.org Machine LearningOct-2-2024

Transfer learning is an attractive framework for problems where there is a paucity of data, or where data collection is costly. One common approach to transfer learning is referred to as "model-based", and involves using a model that is pretrained on samples from a source distribution, which is easier to acquire, and then fine-tuning the model on a few samples from the target distribution. The hope is that, if the source and target distributions are ``close", then the fine-tuned model will perform well on the target distribution even though it has seen only a few samples from it. In this work, we study the problem of transfer learning in linear models for both regression and binary classification. In particular, we consider the use of stochastic gradient descent (SGD) on a linear model initialized with pretrained weights and using a small training data set from the target distribution. In the asymptotic regime of large models, we provide an exact and rigorous analysis and relate the generalization errors (in regression) and classification errors (in binary classification) for the pretrained and fine-tuned models. In particular, we give conditions under which the fine-tuned model outperforms the pretrained one. An important aspect of our work is that all the results are "universal", in the sense that they depend only on the first and second order statistics of the target distribution. They thus extend well beyond the standard Gaussian assumptions commonly made in the literature.

assumption, matrix, universality, (15 more...)

arXiv.org Machine Learning

2410.02164

Country:

North America > United States > California (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Greece (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)

Add feedback

Scalable Multi-Task Transfer Learning for Molecular Property Prediction

Lee, Chanhui, Jeong, Dae-Woong, Ko, Sung Moon, Lee, Sumin, Kim, Hyunseung, Yim, Soorin, Han, Sehui, Kim, Sungwoong, Lim, Sungbin

arXiv.org Artificial IntelligenceOct-1-2024

Molecules have a number of distinct properties whose importance and application vary. Often, in reality, labels for some properties are hard to achieve despite their practical importance. A common solution to such data scarcity is to use models of good generalization with transfer learning. This involves domain experts for designing source and target tasks whose features are shared. However, this approach has limitations: i). Difficulty in accurate design of source-target task pairs due to the large number of tasks, and ii). corresponding computational burden verifying many trials and errors of transfer learning design, thereby iii). constraining the potential of foundation modeling of multi-task molecular property prediction. We address the limitations of the manual design of transfer learning via data-driven bi-level optimization. The proposed method enables scalable multi-task transfer learning for molecular property prediction by automatically obtaining the optimal transfer ratios. Empirically, the proposed method improved the prediction performance of 40 molecular properties and accelerated training convergence.

property prediction, scalable multi-task transfer learning, semanticscholar, (10 more...)

arXiv.org Artificial Intelligence

2410.00432

Country: Asia > South Korea > Seoul > Seoul (0.05)

Genre: Research Report (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)

Add feedback

Meta-RTL: Reinforcement-Based Meta-Transfer Learning for Low-Resource Commonsense Reasoning

Fu, Yu, He, Jie, Yang, Yifan, Liu, Qun, Xiong, Deyi

arXiv.org Artificial IntelligenceSep-27-2024

Meta learning has been widely used to exploit rich-resource source tasks to improve the performance of low-resource target tasks. Unfortunately, most existing meta learning approaches treat different source tasks equally, ignoring the relatedness of source tasks to the target task in knowledge transfer. To mitigate this issue, we propose a reinforcement-based multi-source meta-transfer learning framework (Meta-RTL) for low-resource commonsense reasoning. In this framework, we present a reinforcement-based approach to dynamically estimating source task weights that measure the contribution of the corresponding tasks to the target task in the meta-transfer learning. The differences between the general loss of the meta model and task-specific losses of source-specific temporal meta models on sampled target data are fed into the policy network of the reinforcement learning module as rewards. The policy network is built upon LSTMs that capture long-term dependencies on source task weight estimation across meta learning iterations. We evaluate the proposed Meta-RTL using both BERT and ALBERT as the backbone of the meta model on three commonsense reasoning benchmark datasets. Experimental results demonstrate that Meta-RTL substantially outperforms strong baselines and previous task selection strategies and achieves larger improvements on extremely low-resource settings.

artificial intelligence, commonsense reasoning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2409.19075

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Hong Kong (0.04)
Asia > China > Tianjin Province > Tianjin (0.04)
(5 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

Transfer learning for financial data predictions: a systematic review

Lanzetta, V.

arXiv.org Artificial IntelligenceSep-24-2024

Literature highlighted that financial time series data pose significant challenges for accurate stock price prediction, because these data are characterized by noise and susceptibility to news; traditional statistical methodologies made assumptions, such as linearity and normality, which are not suitable for the non-linear nature of financial time series; on the other hand, machine learning methodologies are able to capture non linear relationship in the data. To date, neural network is considered the main machine learning tool for the financial prices prediction. Transfer Learning, as a method aimed at transferring knowledge from source tasks to target tasks, can represent a very useful methodological tool for getting better financial prediction capability. Current reviews on the above body of knowledge are mainly focused on neural network architectures, for financial prediction, with very little emphasis on the transfer learning methodology; thus, this paper is aimed at going deeper on this topic by developing a systematic review with respect to application of Transfer Learning for financial market predictions and to challenges/potential future directions of the transfer learning methodologies for stock market predictions.

architecture, market prediction, prediction, (15 more...)

arXiv.org Artificial Intelligence

2409.17183

Country:

North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
Europe > Switzerland (0.04)
Asia > China > Shanghai > Shanghai (0.04)
(6 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Promising Solution (0.67)

Industry:

Banking & Finance > Trading (1.00)
Information Technology > Security & Privacy (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Generalization in birdsong classification: impact of transfer learning methods and dataset characteristics

Ghani, Burooj, Kalkman, Vincent J., Planqué, Bob, Vellinga, Willem-Pier, Gill, Lisa, Stowell, Dan

arXiv.org Artificial IntelligenceSep-21-2024

Animal sounds can be recognised automatically by machine learning, and this has an important role to play in biodiversity monitoring. Yet despite increasingly impressive capabilities, bioacoustic species classifiers still exhibit imbalanced performance across species and habitats, especially in complex soundscapes. In this study, we explore the effectiveness of transfer learning in large-scale bird sound classification across various conditions, including single- and multi-label scenarios, and across different model architectures such as CNNs and Transformers. Our experiments demonstrate that both fine-tuning and knowledge distillation yield strong performance, with cross-distillation proving particularly effective in improving in-domain performance on Xeno-canto data. However, when generalizing to soundscapes, shallow fine-tuning exhibits superior performance compared to knowledge distillation, highlighting its robustness and constrained nature. Our study further investigates how to use multi-species labels, in cases where these are present but incomplete. We advocate for more comprehensive labeling practices within the animal sound community, including annotating background species and providing temporal details, to enhance the training of robust bird sound classifiers. These findings provide insights into the optimal reuse of pretrained models for advancing automatic bioacoustic recognition.

classification, dataset, distillation, (16 more...)

arXiv.org Artificial Intelligence

2409.15383

Country:

Europe > Germany (0.04)
Europe > Netherlands > South Holland > Leiden (0.04)
North America > United States > Alaska (0.04)
Asia (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.64)

Add feedback

Transfer Learning for Passive Sonar Classification using Pre-trained Audio and ImageNet Models

Mohammadi, Amirmohammad, Kelhe, Tejashri, Carreiro, Davelle, Van Dine, Alexandra, Peeples, Joshua

arXiv.org Artificial IntelligenceSep-20-2024

Transfer learning is commonly employed to leverage large, pre-trained models and perform fine-tuning for downstream tasks. The most prevalent pre-trained models are initially trained using ImageNet. However, their ability to generalize can vary across different data modalities. This study compares pre-trained Audio Neural Networks (PANNs) and ImageNet pre-trained models within the context of underwater acoustic target recognition (UATR). It was observed that the ImageNet pre-trained models slightly out-perform pre-trained audio models in passive sonar classification. We also analyzed the impact of audio sampling rates for model pre-training and fine-tuning. This study contributes to transfer learning applications of UATR, illustrating the potential of pre-trained models to address limitations caused by scarce, labeled data in the UATR domain.

artificial intelligence, classification, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2409.13878

Country:

North America > United States > Texas > Brazos County > College Station (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
South America > Ecuador (0.04)
(3 more...)

Genre: Research Report (0.82)

Industry: Government > Regional Government (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.82)

Add feedback

Transfer Learning for E-commerce Query Product Type Prediction

Tigunova, Anna, Ricatte, Thomas, Eraisha, Ghadir

arXiv.org Artificial IntelligenceSep-20-2024

Getting a good understanding of the customer intent is essential in e-commerce search engines. In particular, associating the correct product type to a search query plays a vital role in surfacing correct products to the customers. Query product type classification (Q2PT) is a particularly challenging task because search queries are short and ambiguous, the number of existing product categories is extremely large, spanning thousands of values. Moreover, international marketplaces face additional challenges, such as language and dialect diversity and cultural differences, influencing the interpretation of the query. In this work we focus on Q2PT prediction in the global multilocale e-commerce markets. The common approach of training Q2PT models for each locale separately shows significant performance drops in low-resource stores. Moreover, this method does not allow for a smooth expansion to a new country, requiring to collect the data and train a new locale-specific Q2PT model from scratch. To tackle this, we propose to use transfer learning from the highresource to the low-resource locales, to achieve global parity of Q2PT performance. We benchmark the per-locale Q2PT model against the unified one, which shares the training data and model structure across all worldwide stores. Additionally, we compare locale-aware and locale-agnostic Q2PT models, showing the task dependency on the country-specific traits. We conduct extensive quantiative and qualitative analysis of Q2PT models on the large-scale e-commerce dataset across 20 worldwide locales, which shows that unified locale-aware Q2PT model has superior performance over the alternatives.

artificial intelligence, information management, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2410.07121

Country:

North America > Canada (0.05)
Europe > France (0.04)
South America > Brazil (0.04)
(17 more...)

Genre: Research Report > New Finding (0.68)

Industry: Information Technology > Services > e-Commerce Services (1.00)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.61)

Add feedback

Rapid aerodynamic prediction of swept wings via physics-embedded transfer learning

Yang, Yunjia, Li, Runze, Zhang, Yufei, Lu, Lu, Chen, Haixin

arXiv.org Artificial IntelligenceSep-19-2024

Machine learning-based models provide a promising way to rapidly acquire transonic swept wing flow fields but suffer from large computational costs in establishing training datasets. Here, we propose a physics-embedded transfer learning framework to efficiently train the model by leveraging the idea that a three-dimensional flow field around wings can be analyzed with two-dimensional flow fields around cross-sectional airfoils. An airfoil aerodynamics prediction model is pretrained with airfoil samples. Then, an airfoil-to-wing transfer model is fine-tuned with a few wing samples to predict three-dimensional flow fields based on two-dimensional results on each spanwise cross section. Sweep theory is embedded when determining the corresponding airfoil geometry and operating conditions, and to obtain the sectional airfoil lift coefficient, which is one of the operating conditions, the low-fidelity vortex lattice method and data-driven methods are proposed and evaluated. Compared to a nontransfer model, introducing the pretrained model reduces the error by 30%, while introducing sweep theory further reduces the error by 9%. When reducing the dataset size, less than half of the wing training samples are need to reach the same error level as the nontransfer framework, which makes establishing the model much easier.

airfoil, artificial intelligence, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2409.12711

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.46)

Industry:

Transportation > Air (0.68)
Energy > Oil & Gas > Upstream (0.47)
Aerospace & Defense (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.64)

Add feedback

Location based Probabilistic Load Forecasting of EV Charging Sites: Deep Transfer Learning with Multi-Quantile Temporal Convolutional Network

Ali, Mohammad Wazed, Mustafa, Asif bin, Shuvo, Md. Aukerul Moin, Sick, Bernhard

arXiv.org Artificial IntelligenceSep-18-2024

Electrification of vehicles is a potential way of reducing fossil fuel usage and thus lessening environmental pollution. Electric Vehicles (EVs) of various types for different transport modes (including air, water, and land) are evolving. Moreover, different EV user groups (commuters, commercial or domestic users, drivers) may use different charging infrastructures (public, private, home, and workplace) at various times. Therefore, usage patterns and energy demand are very stochastic. Characterizing and forecasting the charging demand of these diverse EV usage profiles is essential in preventing power outages. Previously developed data-driven load models are limited to specific use cases and locations. None of these models are simultaneously adaptive enough to transfer knowledge of day-ahead forecasting among EV charging sites of diverse locations, trained with limited data, and cost-effective. This article presents a location-based load forecasting of EV charging sites using a deep Multi-Quantile Temporal Convolutional Network (MQ-TCN) to overcome the limitations of earlier models. We conducted our experiments on data from four charging sites, namely Caltech, JPL, Office-1, and NREL, which have diverse EV user types like students, full-time and part-time employees, random visitors, etc. With a Prediction Interval Coverage Probability (PICP) score of 93.62\%, our proposed deep MQ-TCN model exhibited a remarkable 28.93\% improvement over the XGBoost model for a day-ahead load forecasting at the JPL charging site. By transferring knowledge with the inductive Transfer Learning (TL) approach, the MQ-TCN model achieved a 96.88\% PICP score for the load forecasting task at the NREL site using only two weeks of data.

forecasting, load forecasting, office-1, (12 more...)

arXiv.org Artificial Intelligence

2409.11862

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Oceania > Australia (0.04)
North America > United States > California (0.04)
Asia > Bangladesh (0.04)

Genre: Research Report > New Finding (0.47)

Industry:

Transportation > Ground > Road (1.00)
Transportation > Electric Vehicle (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.91)

Add feedback

Versatile Incremental Learning: Towards Class and Domain-Agnostic Incremental Learning

Park, Min-Yeong, Lee, Jae-Ho, Park, Gyeong-Moon

arXiv.org Artificial IntelligenceSep-17-2024

Incremental Learning (IL) aims to accumulate knowledge from sequential input tasks while overcoming catastrophic forgetting. Existing IL methods typically assume that an incoming task has only increments of classes or domains, referred to as Class IL (CIL) or Domain IL (DIL), respectively. In this work, we consider a more challenging and realistic but under-explored IL scenario, named Versatile Incremental Learning (VIL), in which a model has no prior of which of the classes or domains will increase in the next task. In the proposed VIL scenario, the model faces intra-class domain confusion and inter-domain class confusion, which makes the model fail to accumulate new knowledge without interference with learned knowledge. To address these issues, we propose a simple yet effective IL framework, named Incremental Classifier with Adaptation Shift cONtrol (ICON). Based on shifts of learnable modules, we design a novel regularization method called Cluster-based Adaptation Shift conTrol (CAST) to control the model to avoid confusion with the previously learned knowledge and thereby accumulate the new knowledge more effectively. Moreover, we introduce an Incremental Classifier (IC) which expands its output nodes to address the overwriting issue from different domains corresponding to a single class while maintaining the previous knowledge. We conducted extensive experiments on three benchmarks, showcasing the effectiveness of our method across all the scenarios, particularly in cases where the next task can be randomly altered. Our implementation code is available at https://github.com/KHU-AGI/VIL.

computer vision, proceedings, scenario, (10 more...)

arXiv.org Artificial Intelligence

2409.10956

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (0.46)

Industry: Education > Educational Setting (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.34)

Add feedback