stgcn
City-Conditioned Memory for Multi-City Traffic and Mobility Forecasting
Deploying spatio-temporal forecasting models across many cities is difficult: traffic networks differ in size and topology, data availability can vary by orders of magnitude, and new cities may provide only a short history of logs. Existing deep traffic models are typically trained per city and backbone, creating high maintenance cost and poor transfer to data-scarce cities. We ask whether a single, backbone-agnostic layer can condition on "which city this sequence comes from", improve accuracy in full- and low-data regimes, and support better cross-city adaptation with minimal code changes. We propose CityCond, a light-weight city-conditioned memory layer that augments existing spatio-temporal backbones. CityCond combines a city-ID encoder with an optional shared memory bank (CityMem). Given a city index and backbone hidden states, it produces city-conditioned features fused through gated residual connections. We attach CityCond to five representative backbones (GRU, TCN, Transformer, GNN, STGCN) and evaluate three regimes: full-data, low-data, and cross-city few-shot transfer on METR-LA and PEMS-BAY. We also run auxiliary experiments on SIND, a drone-based multi-agent trajectory dataset from a signalized intersection in Tianjin (we focus on pedestrian tracks). Across more than fourteen model variants and three random seeds, CityCond yields consistent improvements, with the largest gains for high-capacity backbones such as Transformers and STGCNs. CityMem reduces Transformer error by roughly one third in full-data settings and brings substantial gains in low-data and cross-city transfer. On SIND, simple city-ID conditioning modestly improves low-data LSTM performance. CityCond can therefore serve as a reusable design pattern for scalable, multi-city forecasting under realistic data constraints.
SUSTeR: Sparse Unstructured Spatio Temporal Reconstruction on Traffic Prediction
Wรถlker, Yannick, Beth, Christian, Renz, Matthias, Biastoch, Arne
Mining spatio-temporal correlation patterns for traffic prediction is a well-studied field. However, most approaches are based on the assumption of the availability of and accessibility to a sufficiently dense data source, which is rather the rare case in reality. Traffic sensors in road networks are generally highly sparse in their distribution: fleet-based traffic sensing is sparse in space but also sparse in time. There are also other traffic application, besides road traffic, like moving objects in the marine space, where observations are sparsely and arbitrarily distributed in space. In this paper, we tackle the problem of traffic prediction on sparse and spatially irregular and non-deterministic traffic observations. We draw a border between imputations and this work as we consider high sparsity rates and no fixed sensor locations. We advance correlation mining methods with a Sparse Unstructured Spatio Temporal Reconstruction (SUSTeR) framework that reconstructs traffic states from sparse non-stationary observations. For the prediction the framework creates a hidden context traffic state which is enriched in a residual fashion with each observation. Such an assimilated hidden traffic state can be used by existing traffic prediction methods to predict future traffic states. We query these states with query locations from the spatial domain.
Improving Pain Classification using Spatio-Temporal Deep Learning Approaches with Facial Expressions
Ridouan, Aafaf, Bohi, Amine, Mourchid, Youssef
Pain management and severity detection are crucial for effective treatment, yet traditional self-reporting methods are subjective and may be unsuitable for non-verbal individuals (people with limited speaking skills). To address this limitation, we explore automated pain detection using facial expressions. Our study leverages deep learning techniques to improve pain assessment by analyzing facial images from the Pain Emotion Faces Database (PEMF). We propose two novel approaches1: (1) a hybrid ConvNeXt model combined with Long Short-Term Memory (LSTM) blocks to analyze video frames and predict pain presence, and (2) a Spatio-Temporal Graph Convolution Network (STGCN) integrated with LSTM to process landmarks from facial images for pain detection. Our work represents the first use of the PEMF dataset for binary pain classification and demonstrates the effectiveness of these models through extensive experimentation. The results highlight the potential of combining spatial and temporal features for enhanced pain detection, offering a promising advancement in objective pain assessment methodologies.
Learning-to-solve unit commitment based on few-shot physics-guided spatial-temporal graph convolution network
Yang, Mei, Liu, Gao Qiu andJunyong, Liu, Kai
This letter proposes a few-shot physics-guided spatial temporal graph convolutional network (FPG-STGCN) to fast solve unit commitment (UC). Firstly, STGCN is tailored to parameterize UC. Then, few-shot physics-guided learning scheme is proposed. It exploits few typical UC solutions yielded via commercial optimizer to escape from local minimum, and leverages the augmented Lagrangian method for constraint satisfaction. To further enable both feasibility and continuous relaxation for integers in learning process, straight-through estimator for Tanh-Sign composition is proposed to fully differentiate the mixed integer solution space. Case study on the IEEE benchmark justifies that, our method bests mainstream learning ways on UC feasibility, and surpasses traditional solver on efficiency.
Adaptive Least Mean Squares Graph Neural Networks and Online Graph Signal Estimation
Yan, Yi, Peng, Changran, Kuruoglu, Ercan Engin
HE online prediction of irregularly structured multivariate signals across both spatial and temporal dimensions of the graph signal, which may not be obtainable in reality, is vital in various real-life applications, including demanding the necessity of methods that require no prior weather prediction [1], brain connectivity analysis [2], [3], knowledge. Graph Neural Network (GNN) has extended the traffic flow monitoring [4], and smart grid system management spatial and spectral GSP techniques to time-invariant machine [5]. The signals gathered in real life are often noisy and have learning tasks, including node classification, link prediction, missing values. When representing the irregularly structured and image classification [20], [21], [22]. Different from the multi-dimensionality in the time-varying signals using graphs, GSP approaches, GNN methods such as Graph Convolutional three challenges need to be addressed to bridge the gap Neural Networks and Graph Attention Networks learn the filter between the online prediction of the time-varying signal and from the given data through backpropagation. Additionally, the the spatial representation in the form of graph topology: non-linear activations found in GNNs are capable of handling reconstructing missing data, denoising noisy observations, and non-linear relationships in the signals, enabling GNNs to solve capturing the time-variation.
BASAR:Black-box Attack on Skeletal Action Recognition
Diao, Yunfeng, Shao, Tianjia, Yang, Yong-Liang, Zhou, Kun, Wang, He
Skeletal motion plays a vital role in human activity recognition as either an independent data source or a complement. The robustness of skeleton-based activity recognizers has been questioned recently, which shows that they are vulnerable to adversarial attacks when the full-knowledge of the recognizer is accessible to the attacker. However, this white-box requirement is overly restrictive in most scenarios and the attack is not truly threatening. In this paper, we show that such threats do exist under black-box settings too. To this end, we propose the first black-box adversarial attack method BASAR. Through BASAR, we show that adversarial attack is not only truly a threat but also can be extremely deceitful, because on-manifold adversarial samples are rather common in skeletal motions, in contrast to the common belief that adversarial samples only exist off-manifold. Through exhaustive evaluation and comparison, we show that BASAR can deliver successful attacks across models, data, and attack modes. Through harsh perceptual studies, we show that it achieves effective yet imperceptible attacks. By analyzing the attack on different activity recognizers, BASAR helps identify the potential causes of their vulnerability and provides insights on what classifiers are likely to be more robust against attack.
Spatio-Temporal Graph Convolutional Networks: A Deep Learning Framework for Traffic Forecasting
Yu, Bing, Yin, Haoteng, Zhu, Zhanxing
Timely accurate traffic forecast is crucial for urban traffic control and guidance. Due to the high nonlinearity and complexity of traffic flow, traditional methods cannot satisfy the requirements of mid-and-long term prediction tasks and often neglect spatial and temporal dependencies. In this paper, we propose a novel deep learning framework, Spatio-Temporal Graph Convolutional Networks (STGCN), to tackle the time series prediction problem in traffic domain. Instead of applying regular convolutional and recurrent units, we formulate the problem on graphs and build the model with complete convolutional structures, which enable much faster training speed with fewer parameters. Experiments show that our STGCN model effectively captures comprehensive spatio-temporal correlations through modeling multi-scale traffic networks and consistently outperforms state-of-the-art baselines on various real-world traffic datasets.