Transfer Learning
Adaptive Hoeffding Tree with Transfer Learning for Streaming Synchrophasor Data Sets
Mrabet, Zakaria El, Selvaraj, Daisy Flora, Ranganathan, Prakash
Synchrophasor technology, or phasor measurement units (PMUs), is known to detect multiple types of oscillations or faults better than Supervisory Control and Data Acquisition (SCADA) systems, but the volume of big data (e.g., 30-120 samples per second on a single PMU) generated by these sensors at the aggregator level (e.g., several PMUs) requires special handling. Conventional machine learning or data mining methods are not suited to such large volumes of real-time streaming data. This is primarily due to latencies associated with cloud environments (e.g., at an aggregator or PDC level), which necessitate local computing that moves data processing to the edge (i.e., locally at the PMU level). This in turn requires fast real-time streaming algorithms that can run at the local level (e.g., typically on Field Programmable Gate Array (FPGA) based controllers). This paper proposes a transfer learning-based Hoeffding tree with ADWIN (THAT) method to detect anomalous synchrophasor signatures. The proposed algorithm is trained and tested with the OzaBag method. The preliminary results with transfer learning indicate that the THAT algorithm (0.34 ms) achieves a computational time saving of 0.7 ms over OzaBag (1.04 ms), while the accuracy of both methods in detecting fault events remains at 94% for four signatures.
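For readers unfamiliar with Hoeffding trees, the split rule rests on the Hoeffding bound, which is standard background and not taken from the abstract. The following is a minimal sketch under that assumption; the function and parameter names are illustrative, and it shows neither the THAT implementation nor the ADWIN drift detector.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Hoeffding bound: epsilon = sqrt(R^2 * ln(1/delta) / (2n)).

    value_range (R): range of the split metric (e.g., 1.0 for information
                     gain on a binary problem).
    delta:           allowed probability that the bound fails.
    n:               number of samples observed at the leaf so far.
    """
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float,
                 value_range: float, delta: float, n: int) -> bool:
    """Split once the gap between the two best attributes exceeds epsilon."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


# Illustrative numbers only (not from the paper).
eps = hoeffding_bound(value_range=1.0, delta=1e-7, n=1000)
print(f"epsilon after 1000 samples: {eps:.4f}")
print(should_split(0.30, 0.15, 1.0, 1e-7, 1000))
```

Because epsilon shrinks as n grows, a leaf can commit to a split after seeing only a bounded number of stream samples, which is what makes this family of trees attractive for edge processing.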
Hub-Pathway: Transfer Learning from A Hub of Pre-trained Models
Transfer learning aims to leverage knowledge from pre-trained models to benefit the target task. Prior transfer learning work mainly transfers from a single model. However, with the emergence of deep models pre-trained from different resources, model hubs consisting of diverse models with various architectures, pre-trained datasets and learning paradigms are available. Directly applying single-model transfer learning methods to each model wastes the abundant knowledge of the model hub and suffers from high computational cost. In this paper, we propose a Hub-Pathway framework to enable knowledge transfer from a model hub.
Transfer learning for atomistic simulations using GNNs and kernel mean embeddings
Interatomic potentials learned using machine learning methods have been successfully applied to atomistic simulations. However, accurate models require large training datasets, while generating reference calculations is computationally demanding. To bypass this difficulty, we propose a transfer learning algorithm that leverages the ability of graph neural networks (GNNs) to represent chemical environments together with kernel mean embeddings. We extract a feature map from GNNs pre-trained on the OC20 dataset and use it to learn the potential energy surface from system-specific datasets of catalytic processes. Our method is further enhanced by incorporating into the kernel the chemical species information, resulting in improved performance and interpretability. We test our approach on a series of realistic datasets of increasing complexity, showing excellent generalization and transferability performance, and improving on methods that rely on GNNs or ridge regression alone, as well as similar fine-tuning approaches.
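The general idea of pairing a fixed feature map with a kernel mean embedding can be sketched as follows. This is a toy illustration, not the authors' method: random vectors stand in for pre-trained GNN features, the kernel is a plain linear kernel on the averaged features (no chemical species information), and all names and sizes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
FEATURE_DIM = 8  # hypothetical size of a per-atom feature vector


def mean_embedding(atom_features: np.ndarray) -> np.ndarray:
    """Average per-atom features (n_atoms, d) into one structure vector (d,)."""
    return atom_features.mean(axis=0)


def embed_all(structures) -> np.ndarray:
    """Stack the mean embeddings of a list of structures into (n, d)."""
    return np.stack([mean_embedding(s) for s in structures])


def fit(train_emb: np.ndarray, targets: np.ndarray, ridge: float = 1e-6) -> np.ndarray:
    """Kernel ridge regression: solve (K + ridge * I) alpha = y."""
    K = train_emb @ train_emb.T  # linear kernel between mean embeddings
    return np.linalg.solve(K + ridge * np.eye(len(targets)), targets)


def predict(test_emb: np.ndarray, train_emb: np.ndarray, alpha: np.ndarray) -> np.ndarray:
    """Predict with the cross-kernel between test and training structures."""
    return (test_emb @ train_emb.T) @ alpha


def random_structures(n: int):
    """Toy structures with 3 to 9 'atoms', each carrying a random feature vector."""
    return [rng.normal(size=(rng.integers(3, 10), FEATURE_DIM)) for _ in range(n)]


# Toy target: a linear function of the mean embedding (purely synthetic).
true_w = rng.normal(size=FEATURE_DIM)
train = embed_all(random_structures(20))
test = embed_all(random_structures(5))
alpha = fit(train, train @ true_w)
pred = predict(test, train, alpha)
print(np.abs(pred - test @ true_w).max())
```

Averaging the per-atom features makes the representation independent of the number of atoms, so structures of different sizes can share one regression model.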
Pre-Train Your Loss: Easy Bayesian Transfer Learning with Informative Priors
Deep learning is increasingly moving towards a transfer learning paradigm whereby large foundation models are fine-tuned on downstream tasks, starting from an initialization learned on the source task. But an initialization contains relatively little information about the source task, and does not reflect the belief that our knowledge of the source task should affect the locations and shape of optima on the downstream task. Instead, we show that we can learn highly informative posteriors from the source task, through supervised or self-supervised approaches, which then serve as the basis for priors that modify the whole loss surface on the downstream task. This simple modular approach enables significant performance gains and more data-efficient learning on a variety of downstream classification and segmentation tasks, serving as a drop-in replacement for standard pre-training strategies. These highly informative priors can also be saved for future use, similar to pre-trained weights, and stand in contrast to the zero-mean isotropic uninformative priors that are typically used in Bayesian deep learning.
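The contrast between an informative prior and a zero-mean isotropic one can be made concrete with a small sketch. It assumes a diagonal Gaussian prior whose mean and variance were estimated on the source task; the function names and numbers are illustrative and are not taken from the paper.

```python
import numpy as np


def neg_log_gaussian_prior(params: np.ndarray, mean: np.ndarray, var: np.ndarray) -> float:
    """Negative log-density of a diagonal Gaussian, up to a constant in params."""
    return 0.5 * float(np.sum((params - mean) ** 2 / var))


def downstream_loss(task_loss: float, params: np.ndarray,
                    prior_mean: np.ndarray, prior_var: np.ndarray) -> float:
    """Downstream objective: task loss plus the prior penalty."""
    return task_loss + neg_log_gaussian_prior(params, prior_mean, prior_var)


params = np.array([1.0, -2.0])

# Hypothetical source-task posterior: confident about the first weight,
# uncertain about the second.
source_mean = np.array([0.9, -2.1])
source_var = np.array([0.01, 1.0])

informative = neg_log_gaussian_prior(params, source_mean, source_var)
isotropic = neg_log_gaussian_prior(params, np.zeros(2), np.ones(2))
print(informative, isotropic)
```

With the zero-mean isotropic prior the penalty reduces to ordinary weight decay, whereas the informative prior penalizes movement away from the source solution most strongly along directions the source task pinned down tightly.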
Zero-shot Transfer Learning within a Heterogeneous Graph via Knowledge Transfer Networks
Data continuously emitted from industrial ecosystems such as social or e-commerce platforms are commonly represented as heterogeneous graphs (HG) composed of multiple node/edge types. State-of-the-art graph learning methods for HGs known as heterogeneous graph neural networks (HGNNs) are applied to learn deep context-informed node representations. However, many HG datasets from industrial applications suffer from label imbalance between node types. As there is no direct way to learn using labels rooted at different node types, HGNNs have been applied to only a few node types with abundant labels. We propose a zero-shot transfer learning module for HGNNs called a Knowledge Transfer Network (KTN) that transfers knowledge from label-abundant node types to zero-labeled node types through rich relational information given in the HG. KTN is derived from the theoretical relationship, which we introduce in this work, between distinct feature extractors for each node type given in an HGNN model.
ST-Adapter: Parameter-Efficient Image-to-Video Transfer Learning
Capitalizing on large pre-trained models for various downstream tasks of interest has recently emerged with promising performance. Due to the ever-growing model size, the standard full fine-tuning based task adaptation strategy becomes prohibitively costly in terms of model training and storage. This has led to a new research direction in parameter-efficient transfer learning. However, existing attempts typically focus on downstream tasks from the same modality (e.g., image understanding) as the pre-trained model. This is limiting because in some modalities (e.g., video understanding), such a strong pre-trained model with sufficient knowledge is scarce or unavailable.
Scalable Diverse Model Selection for Accessible Transfer Learning
With the preponderance of pretrained deep learning models available off-the-shelf from model banks today, finding the best weights to fine-tune to your use-case can be a daunting task. Several methods have recently been proposed to find good models for transfer learning, but they either don't scale well to large model banks or don't perform well on the diversity of off-the-shelf models. Ideally the question we want to answer is, "given some data and a source model, can you quickly predict the model's accuracy after fine-tuning?" In this paper, we formalize this setting as "Scalable Diverse Model Selection" and propose several benchmarks for evaluating on this task. We find that existing model selection and transferability estimation methods perform poorly here and analyze why this is the case.
Model-Robust and Adaptive-Optimal Transfer Learning for Tackling Concept Shifts in Nonparametric Regression
Lin, Haotian, Reimherr, Matthew
Nonparametric regression is one of the most extensively studied problems in past decades due to its remarkable flexibility in modeling the relationship between an input X and output Y. While numerous algorithms have been developed, the strong guarantees of learnability and generalization rely on the fact that there are a sufficient number of training samples and that the future data possess the same distribution as the training data. However, training sample scarcity in the target domain of interest and distribution shifts occur frequently in practical applications and deteriorate the effectiveness of most existing algorithms both empirically and theoretically. Transfer learning has emerged as an appealing and promising paradigm for addressing these challenges by leveraging samples or pre-trained models from similar, yet not identical, source domains. In this work, we study the problem of transfer learning in the presence of concept shifts for nonparametric regression over some specific reproducing kernel Hilbert spaces (RKHS). Specifically, we posit there are limited labeled samples from the target domain but sufficient labeled samples from a similar source domain where the concept has shifted, namely, the conditional distribution of Y | X changes across domains, which implies the underlying regression function shifts.
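The setting described in this abstract can be written compactly as follows. The notation (subscripts S and T, sample sizes n_S and n_T, the additive-noise form) is an illustrative assumption and may differ from the paper's own.

```latex
\begin{align*}
\text{target:}\quad & \{(x_i^{T}, y_i^{T})\}_{i=1}^{n_T}, \qquad y_i^{T} = f_T(x_i^{T}) + \varepsilon_i^{T},\\
\text{source:}\quad & \{(x_j^{S}, y_j^{S})\}_{j=1}^{n_S}, \qquad y_j^{S} = f_S(x_j^{S}) + \varepsilon_j^{S}, \qquad n_S \gg n_T,\\
\text{concept shift:}\quad & P_S(Y \mid X) \neq P_T(Y \mid X), \qquad
f_D(x) = \mathbb{E}_D[\,Y \mid X = x\,], \quad D \in \{S, T\},\\
& \text{with } f_S \neq f_T \text{ and } f_S, f_T \text{ assumed to lie in an RKHS.}
\end{align*}
```

The goal is then to estimate f_T using both samples, exploiting the fact that f_S is close to f_T without being identical to it.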
BeST -- A Novel Source Selection Metric for Transfer Learning
Soni, Ashutosh, Ju, Peizhong, Eryilmaz, Atilla, Shroff, Ness B.
One of the most fundamental, and yet relatively less explored, goals in transfer learning is the efficient means of selecting top candidates from a large number of previously trained models (optimized for various "source" tasks) that would perform the best for a new "target" task with a limited amount of data. In this paper, we undertake this goal by developing a novel task-similarity metric (BeST) and an associated method that consistently performs well in identifying the most transferable source(s) for a given task. In particular, our design employs an innovative quantization-level optimization procedure in the context of classification tasks that yields a measure of similarity between a source model and the given target data. The procedure uses a concept similar to early stopping (usually implemented to train deep neural networks (DNNs) to ensure generalization) to derive a function that approximates the transfer learning mapping without training. The advantage of our metric is that it can be quickly computed to identify the top candidate(s) for a given target task before a computationally intensive transfer operation (typically using DNNs) is carried out between the selected source and the target task. As such, our metric can provide significant computational savings when selecting among a large number of possible source models for transfer learning. Through extensive experimental evaluations, we establish that our metric performs well over different datasets and varying numbers of data samples. Transfer learning (Pan and Yang, 2010; Weiss et al., 2016) is a method to increase the efficacy of learning a target task by transferring the knowledge contained in a different but related source task. It is known that the effectiveness of supervised learning depends on the amount of labeled data.
U-Fair: Uncertainty-based Multimodal Multitask Learning for Fairer Depression Detection
Cheong, Jiaee, Bangar, Aditya, Kalkan, Sinan, Gunes, Hatice
We propose accounting for this gender difference in PHQ-8 distributions via U-Fair. Moreover, each gender may display different PHQ-8 task distribution which may result in different PHQ-8 distribution and variance. Although investigation on the relationship between the PHQ-8 and gender has been explored in other fields such as psychiatry (Thibodeau and Asmundson, 2014; Vetter et al., 2013; Leung et al., 2020), this has not been investigated nor accounted for in any of the existing ML for depression detection methods. Moreover, existing work has demonstrated the risk of a fairness-accuracy trade-off (Pleiss et al., 2017) and how mainstream MTL objectives might not correlate well with fairness goals (Wang et al., 2021b). No work has investigated how a MTL approach impacts performance across fairness for the task of depression detection. [...] approach towards building relevant ML for healthcare solutions, we propose a novel method, U-Fair, which accounts for the gender difference in PHQ-8 distribution and leverages on uncertainty as a MTL task reweighing mechanism to achieve better gender fairness for depression detection. Our key contributions are as follows: We conduct the first analysis to investigate how MTL impacts fairness in depression detection by using each PHQ-8 subcriterion as a task. We show that a simplistic baseline MTL approach runs the risk of incurring negative transfer and may not improve on the Pareto frontier. A Pareto frontier can be understood as the set of
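In the usual textbook sense, a point lies on the Pareto frontier when no other point is at least as good on every objective and strictly better on one. The sketch below applies that general definition to hypothetical (accuracy, fairness) pairs; the numbers are invented for illustration and are not results from the paper.

```python
def pareto_frontier(points):
    """Return the points not dominated by any other; both objectives are maximized."""
    frontier = []
    for p in points:
        dominated = any(
            q[0] >= p[0] and q[1] >= p[1] and (q[0] > p[0] or q[1] > p[1])
            for q in points
        )
        if not dominated:
            frontier.append(p)
    return frontier


# Hypothetical (accuracy, fairness) scores for five candidate models.
candidates = [(0.80, 0.60), (0.75, 0.90), (0.70, 0.70), (0.85, 0.50), (0.78, 0.55)]
frontier = pareto_frontier(candidates)
print(frontier)
```

A method that "improves on the Pareto frontier" would add a point that dominates at least one current frontier member, rather than merely trading accuracy for fairness along the existing curve.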