Wang, Shiwei
Revisiting Training-Inference Trigger Intensity in Backdoor Attacks
Lin, Chenhao, Zhao, Chenyang, Wang, Shiwei, Wang, Longtian, Shen, Chao, Zhao, Zhengyu
Backdoor attacks typically place a specific trigger on certain training data, such that the model makes prediction errors on inputs carrying that trigger during inference. Despite the central role of the trigger, existing studies have commonly believed that a perfect match between training and inference triggers is optimal. In this paper, we systematically explore the training-inference trigger relation for the first time, focusing in particular on their mismatch, based on a Training-Inference Trigger Intensity Manipulation (TITIM) workflow. TITIM specifically investigates the training-inference trigger intensity, such as the size or opacity of a trigger, and reveals new insights into trigger generalization and overfitting. These new insights challenge the above common belief by demonstrating that a training-inference trigger mismatch can facilitate attacks in two practical scenarios, posing more significant security threats than previously thought. First, when the inference trigger is fixed, using training triggers with mixed intensities leads to stronger attacks than using any single intensity. For example, on CIFAR-10 with ResNet-18, mixing training triggers with opacities of 1.0 and 0.1 improves the worst-case attack success rate (ASR) (over different testing opacities) of the best single-opacity attack from 10.61% to 92.77%. Second, intentionally using certain mismatched training-inference triggers can improve attack stealthiness, i.e., better bypass defenses. For example, compared with a training/inference intensity of 1.0/1.0, using 1.0/0.7 decreases the area under the curve (AUC) of the Scale-Up defense from 0.96 to 0.62, while maintaining a high ASR (99.65% vs. 91.62%). These new insights are validated to generalize across different backdoor attacks, models, datasets, tasks, and (digital/physical) domains.
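The mixed-intensity poisoning idea can be pictured with a short sketch. Below is a minimal, hypothetical example of opacity-blended trigger injection in which each poisoned image receives a training opacity sampled from a small set (e.g., {1.0, 0.1}); the helper names, trigger/mask handling, and sampling scheme are assumptions for illustration, not the exact TITIM workflow.

```python
import numpy as np

def apply_trigger(image, trigger, mask, opacity):
    """Blend a patch trigger into an image at the given opacity.
    image/trigger: float arrays in [0, 1] of shape (H, W, C);
    mask: {0, 1} array marking the trigger region."""
    return image * (1 - mask * opacity) + trigger * mask * opacity

def poison_with_mixed_intensities(images, trigger, mask,
                                  opacities=(1.0, 0.1), seed=0):
    """Poison a set of images, sampling one training opacity per image,
    so the backdoor generalizes across inference-time intensities."""
    rng = np.random.default_rng(seed)
    return np.stack([apply_trigger(img, trigger, mask, rng.choice(opacities))
                     for img in images])
```

At inference, the attacker stamps the same trigger at some testing opacity; mixing intensities during training is what keeps the worst-case ASR high across those testing opacities.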
E-Sort: Empowering End-to-end Neural Network for Multi-channel Spike Sorting with Transfer Learning and Fast Post-processing
Han, Yuntao, Wang, Shiwei
Decoding extracellular recordings is a crucial task in electrophysiology and brain-computer interfaces. Spike sorting, which detects spikes and assigns them to putative neurons in extracellular recordings, becomes computationally demanding as the channel count of modern neural probes increases. To address the intensive workload and complex neuron interactions, we propose E-Sort, an end-to-end neural-network-based spike sorter with transfer learning and parallelizable post-processing. Our framework reduces the number of annotated spikes required for training by 44% compared to training from scratch, while achieving up to 25.68% higher accuracy. Additionally, our novel post-processing algorithm is compatible with deep learning frameworks, making E-Sort significantly faster than state-of-the-art spike sorters. On synthesized Neuropixels recordings, E-Sort achieves accuracy comparable to Kilosort4 while sorting 50 seconds of data in only 1.32 seconds. Our method demonstrates robustness across various probe geometries, noise levels, and drift conditions, offering substantial improvements in both accuracy and runtime efficiency over existing spike sorters.
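The transfer-learning step can be viewed as adapting a pretrained sorting network to a new probe or recording using a small annotated set. The sketch below is a generic PyTorch fine-tuning loop, assuming a model whose classifier parameters live under a submodule named 'head'; E-Sort's actual architecture and transfer strategy may differ.

```python
import torch
import torch.nn as nn

def fine_tune(model: nn.Module, loader, epochs: int = 5, lr: float = 1e-4):
    # Freeze the pretrained backbone; adapt only the classifier head
    # (assumed to be named 'head') on the small annotated spike set.
    for name, param in model.named_parameters():
        param.requires_grad = name.startswith("head")
    optimizer = torch.optim.Adam(
        (p for p in model.parameters() if p.requires_grad), lr=lr)
    criterion = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for waveforms, labels in loader:  # spike snippets + unit labels
            optimizer.zero_grad()
            loss = criterion(model(waveforms), labels)
            loss.backward()
            optimizer.step()
    return model
```

Starting from pretrained weights rather than random initialization is what cuts the annotation requirement; only the head is optimized here, though unfreezing more layers is an equally plausible choice.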
Knowledge-Reuse Transfer Learning Methods in Molecular and Material Science
Chen, An, Wang, Zhilong, Vidaurre, Karl Luigi Loza, Han, Yanqiang, Ye, Simin, Tao, Kehao, Wang, Shiwei, Gao, Jing, Li, Jinjin
Molecules and materials are the foundation for the development of modern advanced industries such as energy storage systems and semiconductor devices. However, traditional trial-and-error methods and theoretical calculations are highly resource-intensive, and their extremely long R&D (research and development) cycles cannot meet the urgent demand for new molecules/materials in industrial development. Machine learning (ML) methods based on big data are expected to resolve this dilemma. However, the high cost of data acquisition and annotation makes it difficult to construct large-scale datasets of new molecules/materials, which limits the development of ML in this field. Transfer learning lowers the data requirements for model training, which makes it stand out in research addressing data quality issues. In this review, we summarize recent advances in transfer learning related to molecular and materials science. We focus on the application of transfer learning methods to the discovery of advanced molecules/materials, particularly the construction of transfer learning frameworks for different systems and how transfer learning can enhance model performance. In addition, we discuss the remaining challenges of transfer learning.
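One common knowledge-reuse pattern in this setting is feature reuse: a network pretrained on a large source dataset (e.g., an abundantly labeled molecular property) serves as a frozen descriptor generator, and a lightweight model is fit on the small target dataset. The sketch below is illustrative only; the model, data, and source/target pairing are assumptions, not a framework from the review.

```python
import torch
import torch.nn as nn
from sklearn.linear_model import Ridge

def feature_reuse_transfer(pretrained: nn.Module, X_target, y_target):
    # Use the pretrained network as a fixed featurizer for the small
    # target dataset, then fit a simple regressor on its outputs.
    pretrained.eval()
    with torch.no_grad():
        features = pretrained(X_target).cpu().numpy()
    return Ridge(alpha=1.0).fit(features, y_target)
```

Because only the small regressor is trained on the target property, far fewer labeled molecules/materials are needed than when training a deep model from scratch.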
CILIATE: Towards Fairer Class-based Incremental Learning by Dataset and Training Refinement
Gao, Xuanqi, Zhai, Juan, Ma, Shiqing, Shen, Chao, Chen, Yufei, Wang, Shiwei
Due to the model aging problem, Deep Neural Networks (DNNs) need to be updated to adapt to new data distributions. The common practice leverages incremental learning (IL), e.g., Class-based Incremental Learning (CIL), which updates output labels, to update the model with new data and a limited amount of old data. This avoids heavyweight retraining from scratch with conventional methods and saves storage space by reducing the amount of old data to store. However, it also leads to poor fairness. In this paper, we show that CIL suffers from both dataset and algorithm bias, and that existing solutions only partially address the problem. We propose a novel framework, CILIATE, that fixes both dataset and algorithm bias in CIL. It features a novel differential-analysis-guided dataset and training refinement process that identifies unique and important samples overlooked by existing CIL and forces the model to learn from them. Through this process, CILIATE improves the fairness of CIL by 17.03%, 22.46%, and 31.79% compared to the state-of-the-art methods iCaRL, BiC, and WA, respectively, based on our evaluation on three popular datasets and widely used ResNet models.
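A differential analysis of this kind can be sketched by comparing per-sample predictions of the pre-update and post-update models and flagging samples whose correct predictions were lost. The code below is a hypothetical illustration; CILIATE's actual selection criterion and refinement procedure may differ.

```python
import torch

@torch.no_grad()
def forgotten_sample_indices(old_model, new_model, loader, device="cpu"):
    # Flag samples the old model classified correctly but the updated
    # model now misses: candidates to replay/upweight in refined training.
    old_model.eval()
    new_model.eval()
    flagged, offset = [], 0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        old_ok = old_model(x).argmax(dim=1).eq(y)
        new_ok = new_model(x).argmax(dim=1).eq(y)
        lost = (old_ok & ~new_ok).nonzero().flatten().tolist()
        flagged.extend(offset + i for i in lost)
        offset += y.size(0)
    return flagged
```

The flagged indices would then drive the dataset and training refinement, e.g., by oversampling those samples in the next update round.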