Islam, Md. Rabiul
Masked Autoencoder with Swin Transformer Network for Mitigating Electrode Shift in HD-EMG-based Gesture Recognition
Laamerad, Kasra, Shabanpour, Mehran, Islam, Md. Rabiul, Mohammadi, Arash
Multi-channel surface Electromyography (sEMG), also referred to as high-density sEMG (HD-sEMG), plays a crucial role in improving gesture recognition performance for myoelectric control. Pattern recognition models developed based on HD-sEMG, however, are vulnerable to changing recording conditions (e.g., signal variability due to electrode shift). This has resulted in significant degradation in performance across subjects, and sessions. In this context, the paper proposes the Masked Autoencoder with Swin Transformer (MAST) framework, where training is performed on a masked subset of HDsEMG channels. A combination of four masking strategies, i.e., random block masking; temporal masking; sensor-wise random masking, and; multi-scale masking, is used to learn latent representations and increase robustness against electrode shift. The masked data is then passed through MAST's three-path encoder-decoder structure, leveraging a multi-path Swin-Unet architecture that simultaneously captures time-domain, frequency-domain, and magnitude-based features of the underlying HD-sEMG signal. These augmented inputs are then used in a self-supervised pre-training fashion to improve the model's generalization capabilities. Experimental results demonstrate the superior performance of the proposed MAST framework in comparison to its counterparts.
Surface EMG-Based Inter-Session/Inter-Subject Gesture Recognition by Leveraging Lightweight All-ConvNet and Transfer Learning
Islam, Md. Rabiul, Massicotte, Daniel, Massicotte, Philippe Y., Zhu, Wei-Ping
Gesture recognition using low-resolution instantaneous HD-sEMG images opens up new avenues for the development of more fluid and natural muscle-computer interfaces. However, the data variability between inter-session and inter-subject scenarios presents a great challenge. The existing approaches employed very large and complex deep ConvNet or 2SRNN-based domain adaptation methods to approximate the distribution shift caused by these inter-session and inter-subject data variability. Hence, these methods also require learning over millions of training parameters and a large pre-trained and target domain dataset in both the pre-training and adaptation stages. As a result, it makes high-end resource-bounded and computationally very expensive for deployment in real-time applications. To overcome this problem, we propose a lightweight All-ConvNet+TL model that leverages lightweight All-ConvNet and transfer learning (TL) for the enhancement of inter-session and inter-subject gesture recognition performance. The All-ConvNet+TL model consists solely of convolutional layers, a simple yet efficient framework for learning invariant and discriminative representations to address the distribution shifts caused by inter-session and inter-subject data variability. Experiments on four datasets demonstrate that our proposed methods outperform the most complex existing approaches by a large margin and achieve state-of-the-art results on inter-session and inter-subject scenarios and perform on par or competitively on intra-session gesture recognition. These performance gaps increase even more when a tiny amount (e.g., a single trial) of data is available on the target domain for adaptation. These outstanding experimental results provide evidence that the current state-of-the-art models may be overparameterized for sEMG-based inter-session and inter-subject gesture recognition tasks.
S-ConvNet: A Shallow Convolutional Neural Network Architecture for Neuromuscular Activity Recognition Using Instantaneous High-Density Surface EMG Images
Islam, Md. Rabiul, Massicotte, Daniel, Nougarou, Francois, Massicotte, Philippe, Zhu, Wei-Ping
The concept of neuromuscular activity recognition using instantaneous high-density surface electromyography (HD-sEMG) images opens up new avenues for the development of more fluid and natural muscle-computer interfaces. However, the existing approaches employed a very large deep convolutional neural network (ConvNet) architecture and complex training schemes for HD-sEMG image recognition, which requires the network architecture to be pre-trained on a very large-scale labeled training dataset, as a result, it makes computationally very expensive. To overcome this problem, we propose S-ConvNet and All-ConvNet models, a simple yet efficient framework for learning instantaneous HD-sEMG images from scratch for neuromuscular activity recognition. Without using any pre-trained models, our proposed S-ConvNet and All-ConvNet demonstrate very competitive recognition accuracy to the more complex state of the art for neuromuscular activity recognition based on instantaneous HD-sEMG images, while using a ~ 12 x smaller dataset and reducing learning parameters to a large extent. The experimental results proved that the S-ConvNet and All-ConvNet are highly effective for learning discriminative features for instantaneous HD-sEMG image recognition especially in the data and high-end resource constrained scenarios.