 Transfer Learning


Cost-Sensitive Multi-Fidelity Bayesian Optimization with Transfer of Learning Curve Extrapolation

arXiv.org Artificial Intelligence

In this paper, we address the problem of cost-sensitive multi-fidelity Bayesian Optimization (BO) for efficient hyperparameter optimization (HPO). Specifically, we assume a scenario where users want to early-stop the BO when the performance improvement is not satisfactory with respect to the required computational cost. Motivated by this scenario, we introduce utility, which is a function predefined by each user and describes the trade-off between cost and performance of BO. This utility function, combined with our novel acquisition function and stopping criterion, allows us to dynamically choose for each BO step the best configuration that we expect to maximally improve the utility in the future, and also automatically stop the BO around the maximum utility. Further, we improve the sample efficiency of existing learning curve (LC) extrapolation methods with transfer learning, while successfully capturing the correlations between different configurations to develop a sensible surrogate function for multi-fidelity BO. We validate our algorithm on various LC datasets and find that it outperforms all the previous multi-fidelity BO and transfer-BO baselines we consider, achieving a significantly better trade-off between cost and performance of BO.
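The cost/performance trade-off described above can be illustrated with a minimal sketch. It assumes a hypothetical linear utility (best performance so far minus a penalty proportional to accumulated cost); the abstract does not specify the utility's form, and the names `linear_utility`, `best_stopping_step`, `alpha`, and `step_cost` are illustrative only. The sketch finds the utility peak in hindsight on an observed curve; it is not the paper's acquisition function or stopping criterion, which must predict that peak ahead of time.

```python
from typing import Sequence


def linear_utility(best_performance: float, cost: float, alpha: float) -> float:
    """Hypothetical utility: performance penalized linearly by cost (an assumption)."""
    return best_performance - alpha * cost


def best_stopping_step(
    performances: Sequence[float], step_cost: float, alpha: float
) -> int:
    """Return the 1-based step at which the utility of the best-so-far
    performance is highest, given a fixed cost per step."""
    best_so_far = float("-inf")
    best_utility = float("-inf")
    best_step = 0
    for step, perf in enumerate(performances, start=1):
        best_so_far = max(best_so_far, perf)
        utility = linear_utility(best_so_far, step * step_cost, alpha)
        # Strict comparison keeps the earliest step among ties.
        if utility > best_utility:
            best_utility, best_step = utility, step
    return best_step


if __name__ == "__main__":
    curve = [0.50, 0.70, 0.75, 0.76, 0.76]
    print(best_stopping_step(curve, step_cost=1.0, alpha=0.02))
```

A larger `alpha` models a user who values compute more highly and therefore prefers to stop earlier; with `alpha = 0` the sketch reduces to picking the first step that reaches the best performance.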


Efficient Remote Sensing with Harmonized Transfer Learning and Modality Alignment

arXiv.org Artificial Intelligence

With the rise of Visual and Language Pretraining (VLP), an increasing number of downstream tasks are adopting the paradigm of pretraining followed by fine-tuning. Although this paradigm has demonstrated potential in various multimodal downstream tasks, its implementation in the remote sensing domain encounters some obstacles. Specifically, the tendency for same-modality embeddings to cluster together impedes efficient transfer learning. To tackle this issue, we review the aim of multimodal transfer learning for downstream tasks from a unified perspective, and rethink the optimization process based on three distinct objectives. We propose "Harmonized Transfer Learning and Modality Alignment (HarMA)", a method that simultaneously satisfies task constraints, modality alignment, and single-modality uniform alignment, while minimizing training overhead through parameter-efficient fine-tuning. Remarkably, without the need for external data for training, HarMA achieves state-of-the-art performance in two popular multimodal retrieval tasks in the field of remote sensing. Our experiments reveal that HarMA achieves competitive and even superior performance to fully fine-tuned models with only minimal adjustable parameters. Due to its simplicity, HarMA can be integrated into almost all existing multimodal pretraining models. We hope this method can facilitate the efficient application of large models to a wide range of downstream tasks while significantly reducing the resource consumption. Code is available at https://github.com/seekerhuang/HarMA.


Transfer Learning Under High-Dimensional Graph Convolutional Regression Model for Node Classification

arXiv.org Machine Learning

Node classification is a fundamental task, but obtaining node classification labels can be challenging and expensive in many real-world scenarios. Transfer learning has emerged as a promising solution to address this challenge by leveraging knowledge from source domains to enhance learning in a target domain. Existing transfer learning methods for node classification primarily focus on integrating Graph Convolutional Networks (GCNs) with various transfer learning techniques. While these approaches have shown promising results, they often suffer from a lack of theoretical guarantees, restrictive conditions, and high sensitivity to hyperparameter choices. To overcome these limitations, we propose a Graph Convolutional Multinomial Logistic Regression (GCR) model and a transfer learning method based on the GCR model, called Trans-GCR. We provide theoretical guarantees for the estimate obtained under the GCR model in high-dimensional settings. Moreover, Trans-GCR demonstrates superior empirical performance, has a low computational cost, and requires fewer hyperparameters than existing methods.


Transfer Learning with Informative Priors: Simple Baselines Better than Previously Reported

arXiv.org Artificial Intelligence

We pursue transfer learning to improve classifier accuracy on a target task with few labeled examples available for training. Recent work suggests that using a source task to learn a prior distribution over neural net weights, not just an initialization, can boost target task performance. In this study, we carefully compare transfer learning with and without source task informed priors across 5 datasets. We find that standard transfer learning informed by an initialization only performs far better than reported in previous comparisons. The relative gains of methods using informative priors over standard transfer learning vary in magnitude across datasets. For the scenario of 5-300 examples per class, we find negative or negligible gains on 2 datasets, modest gains (between 1.5 and 3 points of accuracy) on 2 other datasets, and substantial gains (>8 points) on one dataset. Among methods using informative priors, we find that an isotropic covariance appears competitive with a learned low-rank covariance matrix while being substantially simpler to understand and tune. Further analysis suggests that the mechanistic justification for informed priors -- hypothesized improved alignment between train and test loss landscapes -- is not consistently supported due to high variability in empirical landscapes. We release code to allow independent reproduction of all experiments.
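To make the isotropic-covariance case concrete, here is a minimal sketch of the penalty that a Gaussian prior N(prior_mean, prior_variance * I) contributes to a MAP training objective. The function name and arguments are hypothetical, and the abstract does not state the exact objective the authors use; the only fact relied on is that the negative log-density of an isotropic Gaussian is, up to an additive constant, the squared distance to the mean divided by twice the variance.

```python
import numpy as np


def isotropic_prior_penalty(weights, prior_mean, prior_variance: float) -> float:
    """Negative log-density, up to an additive constant, of weights under
    N(prior_mean, prior_variance * I). Add this to the target-task loss
    to obtain a MAP objective with an isotropic informative prior."""
    diff = np.asarray(weights, dtype=float) - np.asarray(prior_mean, dtype=float)
    return float(diff @ diff) / (2.0 * prior_variance)


if __name__ == "__main__":
    source_weights = np.array([0.0, 0.0])   # stand-in for a mean learned on the source task
    target_weights = np.array([1.0, 2.0])
    print(isotropic_prior_penalty(target_weights, source_weights, prior_variance=1.0))
```

Under this reading, the single scalar `prior_variance` is the only extra quantity to tune, which is one way to see why the isotropic choice is simpler than a learned low-rank covariance.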


Marvelous Agglutinative Language Effect on Cross Lingual Transfer Learning

arXiv.org Artificial Intelligence

E-Commerce Services such as Amazon and Alibaba have international consumers. The languages used in those services are very diverse. In order to provide a product search system to international consumers, it is essential to develop artificial intelligence that captures the semantic similarities among different languages. In this study, we propose a model that captures similarities between various languages as one artificial intelligence model. As for multilingual language models, it is important to select languages for training because of the curse of multilinguality. It is known that using languages with similar language structures is effective for cross lingual transfer learning. However, we demonstrate that using agglutinative languages such as Korean is more effective in cross lingual transfer learning. This is a great discovery that will change the training strategy of cross lingual transfer learning.


Near-Field Spot Beamfocusing: A Correlation-Aware Transfer Learning Approach

arXiv.org Artificial Intelligence

3D spot beamfocusing (SBF), in contrast to conventional angular-domain beamforming, concentrates radiating power within a very small volume in both radial and angular domains in the near-field zone. Recently, channel-state-information (CSI)-independent machine learning (ML)-based approaches have been developed for effective SBF using extremely-large-scale programmable metasurfaces (ELPMs). These methods involve dividing the ELPMs into subarrays and independently training them with Deep Reinforcement Learning to jointly focus the beam at the Desired Focal Point (DFP). This paper explores near-field SBF using ELPMs, addressing challenges associated with lengthy training times resulting from independent training of subarrays. To achieve a faster CSI-independent solution, inspired by the correlation between the beamfocusing matrices of the subarrays, we leverage transfer learning techniques. First, we introduce a novel similarity criterion based on the Phase Distribution Image of subarray apertures. Then we devise a subarray policy propagation scheme that transfers the knowledge from trained to untrained subarrays. We further enhance learning by introducing Quasi-Liquid-Layers as a revised version of the adaptive policy reuse technique. We show through simulations that the proposed scheme improves the training speed by about 5 times. Furthermore, for dynamic DFP management, we devise a DFP policy blending process, which increases the convergence rate by up to 8-fold.


Transfer Learning Approach for Railway Technical Map (RTM) Component Identification

arXiv.org Artificial Intelligence

Railway transportation is extremely popular all around the globe, which creates a need for digitized databases that include railway track information with all railway track components such as signals, switches and mileposts (Figure 1). A Railway Technical Map (RTM) is a complex diagram (Figure 1) which includes all the information associated with a railway track. At present, most railway companies maintain RTMs designed with computer aided software, yet they are only available in PDF format. These contain partially distorted map components, and identifying those components using basic digital image processing techniques is hard due to their complexity. This work focuses on implementing an automated system to generate CSV formatted files for given RTM input images containing all the digitized data that can be used with further decision support tools. The final formatted text will include the component associativity with mileposts, component names and descriptions.


Transfer Learning for CSI-based Positioning with Multi-environment Meta-learning

arXiv.org Artificial Intelligence

Utilizing deep learning (DL) techniques for radio-based positioning of user equipment (UE) through channel state information (CSI) fingerprints has demonstrated significant potential. DL models can extract complex characteristics from the CSI fingerprints of a particular environment and accurately predict the position of a UE. Nonetheless, the effectiveness of the DL model trained on CSI fingerprints is highly dependent on the particular training environment, limiting the trained model's applicability across different environments. This paper proposes a novel DL model structure consisting of two parts, where the first part aims at identifying features that are independent of any specific environment, while the second part combines those features in an environment-specific way with the goal of positioning. To train such a two-part model, we propose the multi-environment meta-learning (MEML) approach for the first part to facilitate training across various environments, while the second part of the model is trained solely on data from a specific environment. Our findings indicate that employing the MEML approach for initializing the weights of the DL model for a new unseen environment significantly boosts the accuracy of UE positioning in the new target environment as well as the reliability of its uncertainty estimation. This method outperforms traditional transfer learning methods, whether direct transfer learning (DTL) between environments or completely training from scratch with data from a new environment. The proposed approach is verified with real measurements for both line-of-sight (LOS) and non-LOS (NLOS) environments.


Prompt-Enhanced Spatio-Temporal Graph Transfer Learning

arXiv.org Artificial Intelligence

Spatio-temporal graph neural networks have demonstrated efficacy in capturing complex dependencies for urban computing tasks such as forecasting and kriging. However, their performance is constrained by the reliance on extensive data for training on specific tasks, which limits their adaptability to new urban domains with varied demands. Although transfer learning has been proposed to address this problem by leveraging knowledge across domains, cross-task generalization remains underexplored in spatio-temporal graph transfer learning methods due to the absence of a unified framework. To bridge this gap, we propose Spatio-Temporal Graph Prompting (STGP), a prompt-enhanced transfer learning framework capable of adapting to diverse tasks in data-scarce domains. Specifically, we first unify different tasks into a single template and introduce a task-agnostic network architecture that aligns with this template. This approach enables the capture of spatio-temporal dependencies shared across tasks. Furthermore, we employ learnable prompts to achieve domain and task transfer in a two-stage prompting pipeline, enabling the prompts to effectively capture domain knowledge and task-specific properties at each stage. Extensive experiments demonstrate that STGP outperforms state-of-the-art baselines by a notable margin in three downstream tasks: forecasting, kriging, and extrapolation.


Control Theoretic Approach to Fine-Tuning and Transfer Learning

arXiv.org Artificial Intelligence

Given a training set in the form of a paired set $(\mathcal{X},\mathcal{Y})$, we say that the control system $\dot x = f(x,u)$ has learned the paired set via the control $u^*$ if the system steers each point of $\mathcal{X}$ to its corresponding target in $\mathcal{Y}$. If the training set is expanded, most existing methods for finding a new control $u^*$ require starting from scratch, resulting in a quadratic increase in complexity with the number of points. To overcome this limitation, we introduce the concept of $\textit{tuning without forgetting}$. We develop $\textit{an iterative algorithm}$ to tune the control $u^*$ when the training set expands, whereby points already in the paired set are still matched, and new training samples are learned. At each update of our method, the control $u^*$ is projected onto the kernel of the end-point mapping generated by the controlled dynamics at the learned samples. This ensures that the end-points for the previously learned samples remain constant while additional samples are learned iteratively.
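The kernel-projection step can be sketched in a finite-dimensional setting. The sketch assumes the control is parameterized by a vector and that `jacobian` is the derivative of the stacked end-points of the already-learned samples with respect to those parameters; both are simplifying assumptions, since the abstract describes the method for control functions and does not give an implementation. The name `project_onto_kernel` is hypothetical. To first order, an update in the kernel of this Jacobian leaves the learned end-points unchanged.

```python
import numpy as np


def project_onto_kernel(jacobian, update):
    """Project `update` onto the null space (kernel) of `jacobian`.

    Uses the orthogonal projector I - J^+ J, where J^+ is the
    Moore-Penrose pseudoinverse, applied without forming the matrix.
    """
    jacobian = np.atleast_2d(np.asarray(jacobian, dtype=float))
    update = np.asarray(update, dtype=float)
    return update - np.linalg.pinv(jacobian) @ (jacobian @ update)


if __name__ == "__main__":
    # One learned end-point that depends only on the first control parameter.
    J = np.array([[1.0, 0.0, 0.0]])
    raw_update = np.array([1.0, 2.0, 3.0])   # e.g. a gradient step for a new sample
    safe_update = project_onto_kernel(J, raw_update)
    print(safe_update)       # component along the first parameter is removed
    print(J @ safe_update)   # first-order change of the learned end-point
```

In an iterative scheme of this kind, each raw update computed for a new sample would be passed through the projection before being applied, so that previously matched points stay matched to first order.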