Zhang, Luchan
Energy stable neural network for gradient flow equations
Fan, Ganghua, Jin, Tianyu, Lan, Yuan, Xiang, Yang, Zhang, Luchan
Partial differential equations (PDEs) are important tools for solving a wide range of problems in science and engineering. Over the past twenty years, deep neural networks (DNNs) [12, 19] have demonstrated their power in science and engineering applications, and efforts have been made to employ DNNs to solve complex PDEs as an alternative to traditional numerical schemes, especially for problems in high dimensions. Early works [5, 17] use feedforward neural networks to learn the initial/boundary value problem by constraining the network with the differential equation. Methods that use continuous dynamical systems to model the high-dimensional nonlinear functions arising in machine learning were proposed in [6]. A deep learning-based approach for solving high-dimensional parabolic PDEs, based on a stochastic differential equation formulation, was developed in [14].
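As a hedged illustration of the idea referenced above of constraining a feedforward network with the differential equation (and not the method of this paper), the following minimal PyTorch sketch trains a network to satisfy the residual of the 1D Allen-Cahn equation, a simple gradient flow. The equation choice, network, and hyperparameters are illustrative; initial and boundary conditions, which the full problem also constrains, are omitted for brevity.

import torch
import torch.nn as nn

class MLP(nn.Module):
    """Feedforward network u_theta(x, t) with inputs (x, t) and scalar output."""
    def __init__(self, width=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2, width), nn.Tanh(),
            nn.Linear(width, width), nn.Tanh(),
            nn.Linear(width, 1),
        )

    def forward(self, x, t):
        return self.net(torch.cat([x, t], dim=1))

def pde_residual_loss(model, x, t, eps=0.01):
    """Mean squared residual of u_t - eps*u_xx + u^3 - u at collocation points."""
    x = x.requires_grad_(True)
    t = t.requires_grad_(True)
    u = model(x, t)
    u_t = torch.autograd.grad(u, t, torch.ones_like(u), create_graph=True)[0]
    u_x = torch.autograd.grad(u, x, torch.ones_like(u), create_graph=True)[0]
    u_xx = torch.autograd.grad(u_x, x, torch.ones_like(u_x), create_graph=True)[0]
    residual = u_t - eps * u_xx + u**3 - u
    return (residual**2).mean()

model = MLP()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(1000):
    x = torch.rand(256, 1) * 2 - 1   # collocation points in [-1, 1]
    t = torch.rand(256, 1)           # times in [0, 1]
    loss = pde_residual_loss(model, x, t)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()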
Large Transformers are Better EEG Learners
Wang, Bingxin, Fu, Xiaowen, Lan, Yuan, Zhang, Luchan, Xiang, Yang
Pre-trained large transformer models have achieved remarkable performance in natural language processing and computer vision. Since the amount of available labeled electroencephalogram (EEG) data is far smaller than that of text and image data, it is difficult to develop transformers pre-trained on EEG at scales as large as GPT-4 100T and thus fully unleash the potential of this architecture. In this paper, we show that transformers pre-trained on images as well as text can be directly fine-tuned for EEG-based prediction tasks. We design AdaCE, plug-and-play Adapters for Converting EEG data into image and text forms, to fine-tune pre-trained vision and language transformers. The proposed AdaCE module is highly effective for fine-tuning pre-trained transformers and achieves state-of-the-art performance on diverse EEG-based prediction tasks. For example, AdaCE on the pre-trained Swin-Transformer achieves 99.6% accuracy, an absolute improvement of 9.2%, on human activity recognition (UCI HAR). Furthermore, we empirically show that applying AdaCE to fine-tune larger pre-trained models yields better performance on EEG-based prediction tasks, indicating the potential of our adapters for even larger transformers. The plug-and-play AdaCE module can be applied to fine-tune most popular pre-trained transformers on other multi-channel time-series data, and is not limited to EEG data or the models we use. Our code will be available at https://github.com/wangbxj1234/AdaCE.
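The abstract does not spell out the adapter design, so the following is only a minimal sketch under one plausible assumption, not the AdaCE design itself: the (channels x time) EEG array is treated as a grayscale image, replicated to three channels, resized to the vision backbone's input resolution, and passed to a pre-trained Swin-T from torchvision with a new classification head. The model choice, normalization, channel count, and class count are illustrative.

import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import swin_t, Swin_T_Weights

class EEGAsImage(nn.Module):
    """Wraps a pre-trained Swin-T so it accepts raw EEG of shape (B, channels, time)."""
    def __init__(self, num_classes):
        super().__init__()
        # Downloads ImageNet-pretrained weights on first use.
        self.backbone = swin_t(weights=Swin_T_Weights.IMAGENET1K_V1)
        # Replace the classification head for the downstream task.
        self.backbone.head = nn.Linear(self.backbone.head.in_features, num_classes)

    def forward(self, eeg):                      # eeg: (B, channels, time)
        x = eeg.unsqueeze(1)                     # (B, 1, C, T) "grayscale image"
        x = (x - x.mean(dim=(2, 3), keepdim=True)) / (x.std(dim=(2, 3), keepdim=True) + 1e-6)
        x = x.repeat(1, 3, 1, 1)                 # replicate to fake RGB channels
        x = F.interpolate(x, size=(224, 224), mode="bilinear", align_corners=False)
        return self.backbone(x)

# Usage: a batch of 8 recordings, 9 channels, 128 time samples, 6 classes (illustrative).
model = EEGAsImage(num_classes=6)
logits = model(torch.randn(8, 9, 128))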
Feature Flow Regularization: Improving Structured Sparsity in Deep Neural Networks
Wu, Yue, Lan, Yuan, Zhang, Luchan, Xiang, Yang
Pruning is a model compression method that removes redundant parameters from deep neural networks (DNNs) while maintaining accuracy. Most available filter pruning methods require complex treatments such as iterative pruning, feature statistics/ranking, or additional optimization designs in the training process. In this paper, we propose a simple and effective regularization strategy from a new perspective, the evolution of features, which we call feature flow regularization (FFR), for improving structured sparsity and filter pruning in DNNs. Specifically, FFR imposes controls on the gradient and curvature of the feature flow along the neural network, which implicitly increases the sparsity of the parameters. The principle behind FFR is that a coherent and smooth evolution of features leads to an efficient network that avoids redundant parameters. The high structured sparsity obtained from FFR enables us to prune filters effectively. Experiments with VGGNets and ResNets on the CIFAR-10/100 and Tiny ImageNet datasets demonstrate that FFR can significantly improve both unstructured and structured sparsity. Our pruning results, in terms of the reduction of parameters and FLOPs, are comparable to or even better than those of state-of-the-art pruning methods.
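To make the idea of controlling the gradient and curvature of the feature flow concrete, here is a minimal sketch under an explicit assumption (not necessarily the authors' exact formulation): the feature flow is read as the sequence of feature maps produced by consecutive blocks of equal shape, and the regularizer penalizes finite-difference approximations of its first difference (gradient) and second difference (curvature) across depth. The weights lam1 and lam2 are hypothetical hyperparameters.

import torch
import torch.nn.functional as F

def feature_flow_penalty(features, lam1=1e-4, lam2=1e-4):
    """features: list of tensors of identical shape, one per consecutive block."""
    # First difference across depth, a discrete "gradient" of the feature flow.
    grad_term = sum(((features[l + 1] - features[l]) ** 2).mean()
                    for l in range(len(features) - 1))
    # Second difference across depth, a discrete "curvature" of the feature flow.
    curv_term = sum(((features[l + 1] - 2 * features[l] + features[l - 1]) ** 2).mean()
                    for l in range(1, len(features) - 1))
    return lam1 * grad_term + lam2 * curv_term

# Usage: add the penalty to the task loss during training (tensors here are placeholders).
feats = [torch.randn(8, 64, 32, 32) for _ in range(5)]   # e.g. 5 blocks of one ResNet stage
task_loss = F.cross_entropy(torch.randn(8, 10), torch.randint(0, 10, (8,)))
loss = task_loss + feature_flow_penalty(feats)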