st-gcn
Decoding the Stressed Brain with Geometric Machine Learning
Koszut, Sonia, Nallaperuma-Herzberg, Sam, Lio, Pietro
Stress significantly contributes to both mental and physical disorders, yet traditional self-reported questionnaires are inherently subjective. In this study, we introduce a novel framework that employs geometric machine learning to detect stress from raw EEG recordings. Our approach constructs graphs by integrating structural connectivity (derived from electrode spatial arrangement) with functional connectivity from pairwise signal correlations. A spatio-temporal graph convolu-tional network (ST-GCN) processes these graphs to capture spatial and temporal dynamics. Experiments on the SAM-40 dataset show that the ST-GCN outperforms standard machine learning models on all key classification metrics and enhances interpretability, explored through ablation analyses of key channels and brain regions. These results pave the way for more objective and accurate stress detection methods.
Spatio-Temporal Graph Convolutional Networks: Optimised Temporal Architecture
Spatio-Temporal graph convolutional networks were originally introduced with CNNs as temporal blocks for feature extraction. Since then LSTM temporal blocks have been proposed and shown to have promising results. We propose a novel architecture combining both CNN and LSTM temporal blocks and then provide an empirical comparison between our new and the pre-existing models. We provide theoretical arguments for the different temporal blocks and use a multitude of tests across different datasets to assess our hypotheses.
Rehabilitation Exercise Quality Assessment through Supervised Contrastive Learning with Hard and Soft Negatives
Karlov, Mark, Abedi, Ali, Khan, Shehroz S.
Exercise-based rehabilitation programs have proven to be effective in enhancing the quality of life and reducing mortality and rehospitalization rates. AI-driven virtual rehabilitation, which allows patients to independently complete exercises at home, utilizes AI algorithms to analyze exercise data, providing feedback to patients and updating clinicians on their progress. These programs commonly prescribe a variety of exercise types, leading to a distinct challenge in rehabilitation exercise assessment datasets: while abundant in overall training samples, these datasets often have a limited number of samples for each individual exercise type. This disparity hampers the ability of existing approaches to train generalizable models with such a small sample size per exercise. Addressing this issue, our paper introduces a novel supervised contrastive learning framework with hard and soft negative samples that effectively utilizes the entire dataset to train a single model applicable to all exercise types. This model, with a Spatial-Temporal Graph Convolutional Network (ST-GCN) architecture, demonstrated enhanced generalizability across exercises and a decrease in overall complexity. Through extensive experiments on three publicly available rehabilitation exercise assessment datasets, the University of Idaho-Physical Rehabilitation Movement Data (UI-PRMD), IntelliRehabDS (IRDS), and KInematic assessment of MOvement and clinical scores for remote monitoring of physical REhabilitation (KIMORE), our method has shown to surpass existing methods, setting a new benchmark in rehabilitation exercise assessment accuracy.
Gait Identification under Surveillance Environment based on Human Skeleton
Zheng, Xingkai, Li, Xirui, Xu, Ke, Jiang, Xinghao, Sun, Tanfeng
As an emerging biological identification technology, vision-based gait identification is an important research content in biometrics. Most existing gait identification methods extract features from gait videos and identify a probe sample by a query in the gallery. However, video data contains redundant information and can be easily influenced by bagging (BG) and clothing (CL). Since human body skeletons convey essential information about human gaits, a skeleton-based gait identification network is proposed in our project. First, extract skeleton sequences from the video and map them into a gait graph. Then a feature extraction network based on Spatio-Temporal Graph Convolutional Network (ST-GCN) is constructed to learn gait representations. Finally, the probe sample is identified by matching with the most similar piece in the gallery. We tested our method on the CASIA-B dataset. The result shows that our approach is highly adaptive and gets the advanced result in BG, CL conditions, and average.
Spatio-Temporal Graph Scattering Transform
Pan, Chao, Chen, Siheng, Ortega, Antonio
Although spatiotemporal graph neural networks have achieved great empirical success in handling multiple correlated time series, they may be impractical in some real-world scenarios due to a lack of sufficient high-quality training data. Furthermore, spatiotemporal graph neural networks lack theoretical interpretation. To address these issues, we put forth a novel mathematically designed framework to analyze spatiotemporal data. Our proposed spatiotemporal graph scattering transform (ST-GST) extends traditional scattering transforms to the spatiotemporal domain. It performs iterative applications of spatiotemporal graph wavelets and nonlinear activation functions, which can be viewed as a forward pass of spatiotemporal graph convolutional networks without training. Since all the filter coefficients in ST-GST are mathematically designed, it is promising for the real-world scenarios with limited training data, and also allows for a theoretical analysis, which shows that the proposed ST-GST is stable to small perturbations of input signals and structures. Finally, our experiments show that i) ST-GST outperforms spatiotemporal graph convolutional networks by an increase of 35% in accuracy for MSR Action3D dataset; ii) it is better and computationally more efficient to design the transform based on separable spatiotemporal graphs than the joint ones; and iii) the nonlinearity in ST-GST is critical to empirical performance. Processing and learning from spatiotemporal data have received increasing attention recently. Examples include: i) skeleton-based human action recognition based on a sequence of human poses (Liu et al. (2019)), which is critical to human behavior understanding (Borges et al. (2013)), and ii) multi-agent trajectory prediction (Hu et al. (2020)), which is critical to robotics and autonomous driving (Shalev-Shwartz et al. (2016)). A common pattern across these applications is that data evolves in both spatial and temporal domains.
Spatio-Temporal Graph Convolution for Functional MRI Analysis
Gadgil, Soham, Zhao, Qingyu, Adeli, Ehsan, Pfefferbaum, Adolf, Sullivan, Edith V., Pohl, Kilian M.
The BOLD signal of resting-state fMRI (rs-fMRI) records the functional brain connectivity in a rich dynamic spatio-temporal setting. However, existing methods applied to rs-fMRI often fail to consider both spatial and temporal characteristics of the data. They either neglect the functional dependency between different brain regions in a network or discard the information in the temporal dynamics of brain activity. To overcome those shortcomings, we propose to formulate functional connectivity networks within the context of spatio-temporal graphs. We then train a spatio-temporal graph convolutional network (ST-GCN) on short sub-sequences of the BOLD time series to model the non-stationary nature of functional connectivity. We simultaneously learn the graph edge importance within ST-GCN to enable interpretation of functional connectivities contributing to the prediction model. In analyzing the rs-fMRI of the Human Connectome Project (HCP, N=1,091) and the National Consortium on Alcohol and Neurodevelopment in Adolescence (NCANDA, N=773), ST-GCN is significantly more accurate than common approaches in predicting gender and age based on BOLD signals. The matrix recording edge importance localizes brain regions and functional connections with significant aging and sex effects, which are verified by the neuroscience literature.
Spatial-Temporal Graph Convolutional Networks for Sign Language Recognition
de Amorim, Cleison Correia, Macêdo, David, Zanchettin, Cleber
Abstract--The recognition of sign language is a challenging task with an important role in society to facilitate the communication ofdeaf persons. We propose a new approach of Spatial-Temporal Graph Convolutional Network to sign language recognition based on the human skeletal movements. The method uses graphs to capture the signs dynamics in two dimensions, spatial and temporal, considering the complex aspects of the language. Additionally, we present a new dataset of human skeletons for sign language based on ASLLVD to contribute to future related studies. I. INTRODUCTION Sign language is a visual communication skill that enables individuals with different types of hearing impairment to communicate in society. It is the language used by most deaf people in their daily lives and, moreover, it is the symbol of identification between the members of that community and the main force that unites them. The sign language has a very close relationship with the culture of the country or even regions, and for this reason, each nation has its language [1]. According to the World Health Organization, the number of deaf people is about 466 million, and the organization estimates that by 2050 this number exceeds 900 million, which is equivalent to a forecast of 1 in 10 individuals around the world [2].
Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition
Yan, Sijie (The Chinese University of Hong Kong) | Xiong, Yuanjun (The Chinese University of Hong Kong) | Lin, Dahua (The Chinese University of Hong Kong)
Dynamics of human body skeletons convey significant information for human action recognition. Conventional approaches for modeling skeletons usually rely on hand-crafted parts or traversal rules, thus resulting in limited expressive power and difficulties of generalization. In this work, we propose a novel model of dynamic skeletons called Spatial-Temporal Graph Convolutional Networks (ST-GCN), which moves beyond the limitations of previous methods by automatically learning both the spatial and temporal patterns from data. This formulation not only leads to greater expressive power but also stronger generalization capability. On two large datasets, Kinetics and NTU-RGBD, it achieves substantial improvements over mainstream methods.