 Transfer Learning


Transfer Learning and the Early Estimation of Single-Photon Source Quality using Machine Learning Methods

arXiv.org Artificial Intelligence

The use of single-photon sources (SPSs) is central to numerous systems and devices proposed amidst a modern surge in quantum technology. However, manufacturing schemes remain imperfect, and single-photon emission purity must often be experimentally verified via interferometry. Such a process is typically slow and costly, which has motivated growing research into whether SPS quality can be more rapidly inferred from incomplete emission statistics. This study is a sequel to previous work that demonstrated significant uncertainty in the standard method of quality estimation, i.e. the least-squares fitting of a physically motivated function, and asks: can machine learning (ML) do better? The study leverages eight datasets obtained from measurements involving an exemplary quantum emitter, i.e. a single InGaAs/GaAs epitaxial quantum dot; these eight contexts predominantly vary in the intensity of the exciting laser. Specifically, via a form of 'transfer learning', five ML models, three linear and two ensemble-based, are trained on data from seven of the contexts and tested on the eighth. Validation metrics quickly reveal that even a linear regressor can outperform standard fitting when it is tested on the same contexts it was trained on, but the success of transfer learning is less assured, even though statistical analysis, made possible by data augmentation, suggests its superiority as an early estimator. Accordingly, the study concludes by discussing future strategies for grappling with the problem of SPS context dissimilarity, e.g. feature engineering and model adaptation.
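The train-on-seven, test-on-the-eighth protocol described above is a leave-one-context-out evaluation. The following is a minimal sketch of that idea, not the study's pipeline: it assumes a single scalar feature per sample, uses a one-dimensional least-squares line as a stand-in for the five models, and the context names and numbers are invented for illustration.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b for 1-D inputs."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def leave_one_context_out(contexts):
    """Hold out each context in turn, train on the rest, and score it.

    contexts: dict mapping a context name to a list of (feature, target) pairs.
    Returns a dict mapping each held-out context to its mean absolute error.
    """
    errors = {}
    for held_out, test_pairs in contexts.items():
        # Pool the samples from every context except the held-out one.
        train = [pair for name, pairs in contexts.items()
                 if name != held_out for pair in pairs]
        a, b = fit_line([x for x, _ in train], [y for _, y in train])
        errors[held_out] = mean(abs(a * x + b - y) for x, y in test_pairs)
    return errors


# Hypothetical toy data: two similar contexts and one shifted context.
toy = {
    "low_power": [(0.0, 0.10), (1.0, 0.30), (2.0, 0.50)],
    "mid_power": [(0.0, 0.10), (1.0, 0.30), (2.0, 0.50)],
    "high_power": [(0.0, 0.40), (1.0, 0.60), (2.0, 0.80)],
}
errors = leave_one_context_out(toy)
```

In this toy setup the shifted context has the largest held-out error, which mirrors the context-dissimilarity problem the abstract raises: a model can fit its training contexts well and still transfer poorly to one that differs from them.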


Prompt Your Brain: Scaffold Prompt Tuning for Efficient Adaptation of fMRI Pre-trained Model

arXiv.org Artificial Intelligence

We introduce Scaffold Prompt Tuning (ScaPT), a novel prompt-based framework for adapting large-scale functional magnetic resonance imaging (fMRI) pre-trained models to downstream tasks, with high parameter efficiency and improved performance compared to fine-tuning and baselines for prompt tuning. Full fine-tuning updates all pre-trained parameters, which may distort the learned feature space and lead to overfitting when training data is limited, as is common in fMRI research. In contrast, we design a hierarchical prompt structure that transfers the knowledge learned from high-resource tasks to low-resource ones. This structure, equipped with a Deeply-conditioned Input-Prompt (DIP) mapping module, allows for efficient adaptation by updating only 2% of the trainable parameters. The framework enhances semantic interpretability through attention mechanisms between inputs and prompts, and it clusters prompts in the latent space in alignment with prior knowledge. Experiments on public resting-state fMRI datasets reveal that ScaPT outperforms fine-tuning and multitask-based prompt tuning in neurodegenerative disease diagnosis/prognosis and personality trait prediction, even with fewer than 20 participants. This highlights ScaPT's efficiency in adapting pre-trained fMRI models to low-resource tasks.


Unsupervised Transfer Learning via Adversarial Contrastive Training

arXiv.org Machine Learning

Data representation is a fundamental aspect of machine learning that significantly influences model performance, efficiency, and interpretability (Rumelhart et al., 1986; Bengio et al., 2012; LeCun et al., 2015). In the era of deep learning, neural networks have become the primary tools for data representation in computer vision and natural language processing, leveraging their capacity to automatically extract features. For instance, neural networks trained on labeled data can serve as effective feature extractors when the final layer is removed (Goodfellow et al., 2016). The core idea of transfer learning is to leverage learned representations from large upstream datasets to enhance the performance of target-specific downstream tasks. A particularly effective paradigm within transfer learning is pretraining followed by fine-tuning, which has gained increasing attention for its demonstrated efficiency in various studies (Schroff et al., 2015; Dhillon et al., 2020; Chen et al., 2019, 2020c). During the pretraining phase, a representation is learned using a large, general dataset with annotations, which is then transferred to the target-specific task. In the fine-tuning stage, a relatively simple model is typically trained on the learned representation to address the specific problem at hand. A wide variety of transfer learning methods, along with corresponding theoretical guarantees, have been proposed.


Transfer learning of state-based potential games for process optimization in decentralized manufacturing systems

arXiv.org Artificial Intelligence

This paper presents a novel transfer learning approach in state-based potential games (TL-SbPGs) for enhancing distributed self-optimization in manufacturing systems. The approach focuses on the practically relevant industrial setting where sharing and transferring gained knowledge among similarly behaving players improves the self-learning mechanism in large-scale systems. With TL-SbPGs, the gained knowledge can be reused by other players to optimize their policies, thereby improving the learning outcomes of the players and accelerating the learning process. To accomplish this goal, we develop transfer learning concepts and similarity criteria for players, which offer two distinct settings: (a) predefined similarities between players and (b) dynamically inferred similarities between players during training. We formally prove the applicability of the SbPG framework in transfer learning. Additionally, we introduce an efficient method to determine the optimal timing and weighting of the transfer learning procedure during the training phase. Through experiments on a laboratory-scale testbed, we demonstrate that TL-SbPGs significantly boost production efficiency and reduce the power consumption of the production schedules, while also outperforming native SbPGs.


A Unified Manifold Similarity Measure Enhancing Few-Shot, Transfer, and Reinforcement Learning in Manifold-Distributed Datasets

arXiv.org Artificial Intelligence

Training a classifier with high mean accuracy from a manifold-distributed dataset can be challenging. This problem is compounded further when only a few labels are available for training. For transfer learning to work, both the source and target datasets must have a similar manifold structure. As part of this study, we present a novel method for determining the similarity between two manifold structures. This method can be used to determine whether the target and source datasets have a similar manifold structure suitable for transfer learning. We then present a few-shot learning method to classify manifold-distributed datasets with limited labels using transfer learning. Based on the base and target datasets, a similarity comparison is made to determine if the two datasets are suitable for transfer learning. A manifold structure and label distribution are learned from the base and target datasets. When the structures are similar, the manifold structure and its relevant label information from the richly labeled source dataset are transferred to the target dataset. We use the transferred information, together with the labels and unlabeled data from the target dataset, to develop a few-shot classifier that produces high mean classification accuracy on manifold-distributed datasets. In the final part of this article, we discuss the application of our manifold structure similarity measure to reinforcement learning and image recognition.


Tabular Transfer Learning via Prompting LLMs

arXiv.org Artificial Intelligence

Learning with a limited number of labeled data is a central problem in real-world applications of machine learning, as it is often expensive to obtain annotations. To deal with the scarcity of labeled data, transfer learning is a conventional approach; it learns transferable knowledge by training a neural network on multiple other sources. In this paper, we investigate transfer learning of tabular tasks, which has been less studied and less successful in the literature compared to other domains, e.g., vision and language. This is because tables are inherently heterogeneous, i.e., they contain different columns and feature spaces, making transfer learning difficult. On the other hand, recent advances in natural language processing suggest that the label scarcity issue can be mitigated by utilizing the in-context learning capability of large language models (LLMs). Inspired by this and the fact that LLMs can also process tables within a unified language space, we ask whether LLMs can be effective for tabular transfer learning, in particular in scenarios where the source and target datasets are of different formats. As a positive answer, we propose a novel tabular transfer learning framework, coined Prompt to Transfer (P2T), that utilizes unlabeled (or heterogeneous) source data with LLMs. Specifically, P2T identifies a column feature in a source dataset that is strongly correlated with a target task feature to create examples relevant to the target task, thus creating pseudo-demonstrations for prompts. Experimental results demonstrate that P2T outperforms previous methods on various tabular learning benchmarks, showing good promise for the important, yet underexplored tabular transfer learning problem. Code is available at https://github.com/jaehyun513/P2T.


Scaling Law of Sim2Real Transfer Learning in Expanding Computational Materials Databases for Real-World Predictions

arXiv.org Artificial Intelligence

To address the challenge of limited experimental materials data, extensive physical property databases are being developed based on high-throughput computational experiments, such as molecular dynamics simulations. Previous studies have shown that fine-tuning a predictor pretrained on a computational database to a real system can result in models with outstanding generalization capabilities compared to learning from scratch. This study demonstrates the scaling law of simulation-to-real (Sim2Real) transfer learning for several machine learning tasks in materials science. Case studies of three prediction tasks for polymers and inorganic materials reveal that the prediction error on real systems decreases according to a power-law as the size of the computational data increases. Observing the scaling behavior offers various insights for database development, such as determining the sample size necessary to achieve a desired performance, identifying equivalent sample sizes for physical and computational experiments, and guiding the design of data production protocols for downstream real-world tasks.
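The power-law relationship described above can be illustrated with a short sketch. It assumes the simplest form, error = c * n^(-alpha), with no irreducible error floor; the exact functional form used in the study may differ, and the function names and data points below are synthetic and purely illustrative.

```python
import math


def fit_power_law(sizes, errors):
    """Fit error = c * n**(-alpha) by least squares in log-log space.

    Returns (c, alpha). Assumes all sizes and errors are positive.
    """
    lx = [math.log(n) for n in sizes]
    ly = [math.log(e) for e in errors]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    slope = (sum((x - mx) * (y - my) for x, y in zip(lx, ly))
             / sum((x - mx) ** 2 for x in lx))
    intercept = my - slope * mx
    return math.exp(intercept), -slope


def samples_needed(c, alpha, target_error):
    """Invert error = c * n**(-alpha) to get the size n that reaches target_error."""
    return (c / target_error) ** (1.0 / alpha)


# Synthetic points generated from error = 2 * n**(-0.5), for illustration only.
sizes = [100, 1_000, 10_000, 100_000]
errors = [2.0 * n ** -0.5 for n in sizes]

c, alpha = fit_power_law(sizes, errors)
n_required = samples_needed(c, alpha, target_error=0.01)
```

Inverting the fitted curve is one way to estimate the computational sample size needed for a desired real-world error, which is the kind of database-planning question the abstract mentions.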


AdapMTL: Adaptive Pruning Framework for Multitask Learning Model

arXiv.org Artificial Intelligence

In the domain of multimedia and multimodal processing, the efficient handling of diverse data streams such as images, video, and sensor data is paramount. Model compression and multitask learning (MTL) are crucial in this field, offering the potential to address the resource-intensive demands of processing and interpreting multiple forms of media simultaneously. However, effectively compressing a multitask model presents significant challenges due to the complexities of balancing sparsity allocation and accuracy performance across multiple tasks. To tackle these challenges, we propose AdapMTL, an adaptive pruning framework for MTL models. AdapMTL leverages multiple learnable soft thresholds independently assigned to the shared backbone and the task-specific heads to capture the nuances in different components' sensitivity to pruning. During training, it co-optimizes the soft thresholds and MTL model weights to automatically determine the suitable sparsity level at each component to achieve both high task accuracy and high overall sparsity. It further incorporates an adaptive weighting mechanism that dynamically adjusts the importance of task-specific losses based on each task's robustness to pruning. We demonstrate the effectiveness of AdapMTL through comprehensive experiments on popular multitask datasets, namely NYU-v2 and Tiny-Taskonomy, with different architectures, showcasing superior performance compared to state-of-the-art pruning methods.
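To make the idea of per-component thresholds concrete, here is a minimal sketch of magnitude-based soft thresholding applied with different thresholds to a shared backbone and to task heads. It is a generic operator, not AdapMTL's actual parameterization: the thresholds are fixed constants here rather than learned, and the component names, weights, and threshold values are invented for illustration.

```python
def soft_threshold(w, t):
    """Shrink weight w toward zero by t; weights with |w| <= t become exactly 0."""
    magnitude = abs(w) - t
    if magnitude <= 0:
        return 0.0
    return magnitude if w > 0 else -magnitude


def prune(weights, t):
    """Apply the soft-threshold operator to every weight in a component."""
    return [soft_threshold(w, t) for w in weights]


def sparsity(weights):
    """Fraction of weights that are exactly zero."""
    return sum(1 for w in weights if w == 0.0) / len(weights)


# Hypothetical components, each with its own threshold.
components = {
    "backbone": ([0.8, -0.05, 0.3, -0.02], 0.1),
    "head_segmentation": ([0.4, -0.6, 0.15, 0.05], 0.2),
    "head_depth": ([0.9, -0.7, 0.5, 0.01], 0.05),
}

pruned = {name: prune(w, t) for name, (w, t) in components.items()}
per_component_sparsity = {name: sparsity(w) for name, w in pruned.items()}
```

Because each component has its own threshold, a component that tolerates pruning well can be made sparser than one that does not, which is the motivation the abstract gives for assigning thresholds independently.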


On the Generalization for Transfer Learning: An Information-Theoretic Analysis

arXiv.org Artificial Intelligence

Transfer learning, or domain adaptation, is concerned with machine learning problems in which training and testing data come from possibly different probability distributions. In this work, we give an information-theoretic analysis of the generalization error and excess risk of transfer learning algorithms. Our results suggest, perhaps as expected, that the Kullback-Leibler (KL) divergence $D(\mu\|\mu')$ plays an important role in the characterizations, where $\mu$ and $\mu'$ denote the distributions of the training data and the testing data, respectively. Specifically, we provide generalization error and excess risk upper bounds for learning algorithms where data from both distributions are available in the training phase. Recognizing that the bounds could be sub-optimal in general, we provide improved excess risk upper bounds for a certain class of algorithms, including the empirical risk minimization (ERM) algorithm, by making stronger assumptions through the central condition. To demonstrate the usefulness of the bounds, we further extend the analysis to the Gibbs algorithm and the noisy stochastic gradient descent method. We then generalize the mutual information bound with other divergences such as $\phi$-divergence and Wasserstein distance, which may lead to tighter bounds and can handle the case when $\mu$ is not absolutely continuous with respect to $\mu'$. Several numerical results are provided to demonstrate our theoretical findings. Lastly, to address the problem that the bounds are often not directly applicable in practice due to the absence of distributional knowledge of the data, we develop an algorithm (called InfoBoost) that dynamically adjusts the importance weights for both source and target data based on certain information measures. The empirical results show the effectiveness of the proposed algorithm.
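As a small illustration of the quantity at the center of this analysis, the sketch below computes the KL divergence between two discrete distributions. It is a textbook definition in nats, not code from the paper, and the example distributions are made up. It also shows why absolute continuity matters: when the training distribution puts mass where the testing distribution has none, the KL divergence is infinite.

```python
import math


def kl_divergence(mu, mu_prime):
    """D(mu || mu') in nats for discrete distributions given as equal-length lists."""
    total = 0.0
    for p, q in zip(mu, mu_prime):
        if p == 0.0:
            continue  # terms with p = 0 contribute nothing, by convention
        if q == 0.0:
            return math.inf  # mu is not absolutely continuous w.r.t. mu'
        total += p * math.log(p / q)
    return total


train = [0.5, 0.5]

same = kl_divergence(train, [0.5, 0.5])        # identical distributions
shifted = kl_divergence(train, [0.25, 0.75])   # moderate distribution shift
disjoint = kl_divergence(train, [1.0, 0.0])    # test distribution misses an outcome
```

Here `same` is zero, `shifted` is a small positive number, and `disjoint` is infinite. The last case is the kind of situation where a KL-based bound becomes uninformative, which is the motivation the abstract gives for also considering divergences such as the Wasserstein distance.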


Learning to Select the Best Forecasting Tasks for Clinical Outcome Prediction

arXiv.org Artificial Intelligence

The paradigm of 'pretraining' from a set of relevant auxiliary tasks and then 'finetuning' on a target task has been successfully applied in many different domains. However, when the auxiliary tasks are abundant, with complex relationships to the target task, using domain knowledge or searching over all possible pretraining setups is inefficient and suboptimal. To address this challenge, we propose a method to automatically select from a large set of auxiliary tasks, which yields a representation most useful to the target task. In particular, we develop an efficient algorithm that uses automatic auxiliary task selection within a nested-loop meta-learning process. We have applied this algorithm to the task of clinical outcome predictions in electronic medical records, learning from a large number of self-supervised tasks related to forecasting patient trajectories. Experiments on a real clinical dataset demonstrate the superior predictive performance of our method compared to direct supervised learning, naive pretraining and simple multitask learning, in particular in low-data scenarios when the primary task has very few examples. With detailed ablation analysis, we further show that the selection rules are interpretable and able to generalize to unseen target tasks with new data.