Transfer Learning
ITL-LIME: Instance-Based Transfer Learning for Enhancing Local Explanations in Low-Resource Data Settings
Raza, Rehan, Wang, Guanjin, Wong, Kok Wai, Laga, Hamid, Fisichella, Marco
Explainable Artificial Intelligence (XAI) methods, such as Local Interpretable Model-Agnostic Explanations (LIME), have advanced the interpretability of black-box machine learning models by approximating their behavior locally using interpretable surrogate models. However, LIME's inherent randomness in perturbation and sampling can lead to locality and instability issues, especially in scenarios with limited training data. In such cases, data scarcity can result in the generation of unrealistic variations and samples that deviate from the true data manifold. Consequently, the surrogate model may fail to accurately approximate the complex decision boundary of the original model. To address these challenges, we propose a novel Instance-based Transfer Learning LIME framework (ITL-LIME) that enhances explanation fidelity and stability in data-constrained environments. ITL-LIME introduces instance transfer learning into the LIME framework by leveraging relevant real instances from a related source domain to aid the explanation process in the target domain. Specifically, we employ clustering to partition the source domain into clusters with representative prototypes. Instead of generating random perturbations, our method retrieves pertinent real source instances from the source cluster whose prototype is most similar to the target instance. These are then combined with the target instance's neighboring real instances. To define a compact locality, we further construct a contrastive learning-based encoder as a weighting mechanism to assign weights to the instances from the combined set based on their proximity to the target instance. Finally, these weighted source and target instances are used to train the surrogate model for explanation purposes.
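The pipeline described in this abstract (cluster the source domain, retrieve the cluster whose prototype is nearest to the target instance, add the target instance's real neighbors, weight by proximity, fit a surrogate) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the use of plain k-means, and the weighted ridge surrogate are all hypothetical choices, and a Gaussian kernel on Euclidean distance stands in for the contrastive learning-based encoder, whose details the abstract does not give.

```python
import numpy as np

def assign(X, centers):
    """Index of the nearest center for every row of X."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return d.argmin(axis=1)

def kmeans(X, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns (prototypes, labels)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = assign(X, centers)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers, assign(X, centers)

def explain_instance(x, black_box, X_source, X_target,
                     k=3, n_neighbors=10, width=1.0, ridge=1e-6):
    """Fit a weighted linear surrogate around x using real instances only."""
    # 1. Partition the source domain into clusters with prototypes.
    prototypes, labels = kmeans(X_source, k)
    # 2. Retrieve the source cluster whose prototype is closest to x.
    nearest = np.linalg.norm(prototypes - x, axis=1).argmin()
    source_pool = X_source[labels == nearest]
    # 3. Combine with the nearest real target-domain neighbors of x.
    order = np.linalg.norm(X_target - x, axis=1).argsort()[:n_neighbors]
    Z = np.vstack([source_pool, X_target[order]])
    # 4. Proximity weights. ASSUMPTION: a Gaussian kernel replaces the
    #    contrastive encoder used in the paper.
    dist = np.linalg.norm(Z - x, axis=1)
    w = np.exp(-(dist ** 2) / (2.0 * width ** 2))
    # 5. Weighted ridge regression on the black-box outputs.
    y = black_box(Z)
    A = np.hstack([Z, np.ones((len(Z), 1))])   # last column = intercept
    reg = ridge * np.eye(A.shape[1])
    reg[-1, -1] = 0.0                          # do not penalize the intercept
    coef = np.linalg.solve(A.T @ (w[:, None] * A) + reg, A.T @ (w * y))
    return coef[:-1], coef[-1]
```

The returned coefficients play the role of the local explanation: each one estimates how strongly the corresponding feature drives the black-box output near `x`.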
Transfer Learning for Neutrino Scattering: Domain Adaptation with GANs
Bonilla, Jose L., Graczyk, Krzysztof M., Ankowski, Artur M., Banerjee, Rwik Dharmapal, Kowal, Beata E., Prasad, Hemant, Sobczyk, Jan T.
Significant experimental efforts have been devoted to studying (anti)neutrino-nucleus interactions [1, 2] in the energy range relevant for next-generation neutrino oscillation experiments, such as Hyper-Kamiokande [3] and DUNE [4]. In parallel, theoretical models describing these interactions have been developed [2]. The outcomes of both experimental and theoretical advances are incorporated into Monte Carlo (MC) event generators, which simulate (anti)neutrino-nucleus collisions under realistic conditions [5-10]. MC generators are often tuned to reproduce experimental observations, relying on adjustable parameters that are fitted using available data [11]. However, this tuning process cannot fully compensate for the fundamental limitations of the underlying models, especially those relying on complex approximations, such as nuclear modeling. Consequently, there is a growing interest in alternatives to traditional MC event generation: methods that can learn directly from experimental data and dynamically refine their predictions.
SEDEG: Sequential Enhancement of Decoder and Encoder's Generality for Class Incremental Learning with Small Memory
Chen, Hongyang, Pu, Shaoling, Zheng, Lingyu, Sun, Zhongwu
In incremental learning, enhancing the generality of knowledge is crucial for adapting to dynamic data inputs. It can develop generalized representations or more balanced decision boundaries, preventing the degradation of long-term knowledge over time and thus mitigating catastrophic forgetting. Some emerging incremental learning methods adopt an encoder-decoder architecture and have achieved promising results. In the encoder-decoder architecture, improving the generalization capabilities of both the encoder and decoder is critical, as it helps preserve previously learned knowledge while ensuring adaptability and robustness to new, diverse data inputs. However, many existing continual learning methods focus solely on enhancing one of the two components, which limits their effectiveness in mitigating catastrophic forgetting. These methods perform even worse in small-memory scenarios, where only a limited number of historical samples can be stored. To mitigate this limitation, we introduce SEDEG, a two-stage training framework for vision transformers (ViT) that sequentially improves the generality of both the decoder and the encoder. In the first stage, SEDEG trains an ensembled encoder through feature boosting to learn generalized representations, which subsequently enhance the decoder's generality and balance the classifier. In the second stage, knowledge distillation (KD) strategies compress the ensembled encoder into a new, more generalized encoder, using a balanced KD approach and feature KD for effective knowledge transfer. Extensive experiments on three benchmark datasets show SEDEG's superior performance, and ablation studies confirm the efficacy of its components. The code is available at https://github.com/ShaolingPu/CIL.
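To make the two distillation terms mentioned in this abstract concrete, the sketch below shows a generic logit-level KD loss and a feature-level KD loss. It is an assumption-laden illustration: the abstract does not specify SEDEG's balanced KD, so the standard temperature-scaled KL divergence is used here as a stand-in, and the function names are hypothetical rather than taken from the linked repository.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def logit_kd_loss(student_logits, teacher_logits, temperature=2.0):
    """Temperature-scaled KL(teacher || student), averaged over the batch.

    ASSUMPTION: this is the generic KD loss, not SEDEG's balanced variant.
    """
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    eps = 1e-12  # avoids log(0)
    kl = np.sum(p_t * (np.log(p_t + eps) - np.log(p_s + eps)), axis=-1)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return float(kl.mean() * temperature ** 2)

def feature_kd_loss(student_feat, teacher_feat):
    """Mean squared L2 distance between student and teacher features."""
    diff = student_feat - teacher_feat
    return float(np.mean(np.sum(diff ** 2, axis=-1)))
```

In a compression stage of this kind, the teacher would be the ensembled encoder and the student the new single encoder; a weighted sum of the two losses is one common way to combine them.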
Inductive transfer learning from regression to classification in ECG analysis
Jayasundara, Ridma, Fernando, Ishan, Fernando, Adeepa, Ragel, Roshan, Thambawita, Vajira, Nawinne, Isuru
Cardiovascular diseases (CVDs) are the leading cause of mortality worldwide, accounting for over 30% of global deaths according to the World Health Organization (WHO). Importantly, one-third of these deaths are preventable with timely and accurate diagnosis. The electrocardiogram (ECG), a non-invasive method for recording the electrical activity of the heart, is crucial for diagnosing CVDs. However, privacy concerns surrounding the use of patient ECG data in research have spurred interest in synthetic data, which preserves the statistical properties of real data without compromising patient confidentiality. This study explores the potential of synthetic ECG data for training deep learning models on regression tasks and evaluates the feasibility of transferring those models to classification to enhance performance on real ECG data. We experimented with popular deep learning models to predict four key cardiac parameters, namely Heart Rate (HR), PR interval, QT interval, and QRS complex, using separate regression models. Subsequently, we leveraged these regression models for transfer learning to perform 5-class ECG signal classification. Our experiments systematically investigate whether transfer learning from regression to classification is viable, enabling better utilization of diverse open-access and synthetic ECG datasets. Our findings demonstrate that transfer learning from regression to classification improves classification performance, highlighting its potential to maximize the utility of available data and advance deep learning applications in this domain.
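One common way to realize regression-to-classification transfer is to keep the pretrained regression backbone and replace its output head with a classification head. The sketch below illustrates that idea on a toy one-hidden-layer network. Everything in it is an assumption for illustration: the abstract does not name the architectures or say whether the backbone is frozen or fine-tuned, and `TinyNet`, `to_classifier`, and `train_head` are hypothetical names.

```python
import numpy as np

class TinyNet:
    """One hidden layer: a backbone (W1, b1) plus a task-specific linear head."""
    def __init__(self, n_in, n_hidden, n_out, rng):
        self.W1 = rng.normal(scale=0.5, size=(n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(scale=0.1, size=(n_hidden, n_out))
        self.b2 = np.zeros(n_out)

    def features(self, X):
        return np.tanh(X @ self.W1 + self.b1)

    def forward(self, X):
        return self.features(X) @ self.W2 + self.b2

def to_classifier(regressor, n_classes, rng):
    """Copy the regressor's backbone and attach a fresh n_classes-way head."""
    n_in, n_hidden = regressor.W1.shape
    clf = TinyNet(n_in, n_hidden, n_classes, rng)
    clf.W1, clf.b1 = regressor.W1.copy(), regressor.b1.copy()
    return clf

def train_head(clf, X, y, lr=0.1, epochs=200):
    """Softmax regression on frozen backbone features (head-only training)."""
    H = clf.features(X)                      # backbone stays fixed
    Y = np.eye(clf.W2.shape[1])[y]           # one-hot labels
    for _ in range(epochs):
        logits = H @ clf.W2 + clf.b2
        logits -= logits.max(axis=1, keepdims=True)
        P = np.exp(logits)
        P /= P.sum(axis=1, keepdims=True)
        G = (P - Y) / len(X)                 # gradient of mean cross-entropy
        clf.W2 -= lr * (H.T @ G)
        clf.b2 -= lr * G.sum(axis=0)
    return clf
```

Here a regressor with a single output (for example, one predicting HR) would be converted with `to_classifier(regressor, 5, rng)` and its new head trained on labeled ECG segments; unfreezing the backbone for full fine-tuning is the obvious alternative design.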
Scalable Diverse Model Selection for Accessible Transfer Learning Supplemental Material (Appendix)
We display full results for all methods here. This means that the source feature quality doesn't matter nearly as much. Since source feature quality is the only metric these methods use to predict transfer performance, they do poorly here. In Tab. 1, we display the Pearson Correlation for each target dataset individually. We also include results for additional baselines and skews of existing methods.
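For readers unfamiliar with the evaluation metric, the sketch below computes the Pearson correlation between a selection method's predicted transfer scores and the measured transfer accuracies, separately for each target dataset. The function names and the dictionary layout are assumptions made for illustration; they are not the paper's code, and no values from Tab. 1 are reproduced.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

def per_target_correlation(predicted, measured):
    """Correlation per target dataset.

    Both arguments map a target-dataset name to a list with one entry per
    candidate source model (hypothetical layout): the method's predicted
    score and the accuracy actually obtained after transfer.
    """
    return {name: pearson(predicted[name], measured[name]) for name in predicted}
```

A value near 1 means the method ranks source models for that target in close agreement with their actual transfer performance; a value near 0 means its scores carry little information about it.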