Attention in Convolutional LSTM for Gesture Recognition
Convolutional long short-term memory (LSTM) networks have been widely used for action/gesture recognition, and various attention mechanisms have been embedded into LSTM or convolutional LSTM (ConvLSTM) networks. Building on previous gesture recognition architectures that combine a three-dimensional convolutional neural network (3DCNN) with ConvLSTM, this paper explores the effects of attention mechanisms in ConvLSTM. Several variants of ConvLSTM are evaluated: (a) removing the convolutional structures of the three gates in ConvLSTM, (b) applying an attention mechanism to the input of ConvLSTM, and reconstructing (c) the input gate and (d) the output gate with a modified channel-wise attention mechanism. The evaluation results demonstrate that the spatial convolutions in the three gates scarcely contribute to the spatiotemporal feature fusion, and that attention mechanisms embedded into the input and output gates do not improve it. In other words, when taking spatial or spatiotemporal features as input, ConvLSTM mainly contributes temporal fusion across the recurrent steps to learn long-term spatiotemporal features. On this basis, a new variant of LSTM is derived, in which convolutional structures are embedded only into the input-to-state transition of the LSTM. The code of the LSTM variants is publicly available.
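The derived variant can be sketched as follows: a minimal NumPy cell in which only the input-to-state (candidate) transition uses a spatial convolution, while the three gates use element-wise weights instead of convolutions. All weight names, shapes, and the single-channel restriction are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def conv2d(x, k):
    """'Same' 2D convolution of a single-channel map x with kernel k
    (assumption: stride 1, zero padding)."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x)
    H, W = x.shape
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class InputConvLSTMCell:
    """Hypothetical sketch of the LSTM variant: spatial convolution only in the
    input-to-state transition; the input, forget, and output gates use
    element-wise (Hadamard) weights, with no spatial convolution."""

    def __init__(self, H, W, ksize=3, seed=0):
        rng = np.random.default_rng(seed)
        s = 0.1
        # element-wise input weights for the three gates (no convolution)
        self.Wxi, self.Wxf, self.Wxo = (rng.normal(0, s, (H, W)) for _ in range(3))
        # element-wise recurrent weights
        self.Whi, self.Whf, self.Who, self.Whg = (rng.normal(0, s, (H, W)) for _ in range(4))
        # convolution kernel used only for the input-to-state transition
        self.Kxg = rng.normal(0, s, (ksize, ksize))

    def step(self, x, h, c):
        i = sigmoid(self.Wxi * x + self.Whi * h)          # input gate
        f = sigmoid(self.Wxf * x + self.Whf * h)          # forget gate
        o = sigmoid(self.Wxo * x + self.Who * h)          # output gate
        g = np.tanh(conv2d(x, self.Kxg) + self.Whg * h)   # candidate: conv on input only
        c = f * c + i * g
        h = o * np.tanh(c)
        return h, c
```

Restricting convolution to the input-to-state path keeps the spatial feature extraction while letting the gates act purely as per-location temporal fusion, which is consistent with the paper's finding that gate convolutions contribute little.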
Convolutional State Space Models for Long-Range Spatiotemporal Modeling
Effectively modeling long spatiotemporal sequences is challenging due to the need to model complex spatial correlations and long-range temporal dependencies simultaneously. ConvLSTMs attempt to address this by updating tensor-valued states with recurrent neural networks, but their sequential computation makes them slow to train. In contrast, Transformers can process an entire spatiotemporal sequence, compressed into tokens, in parallel.
Axial-UNet: A Neural Weather Model for Precipitation Nowcasting
Sumit Mamtani, Maitreya Sonawane
Accurately predicting short-term precipitation is critical for weather-sensitive applications such as disaster management, aviation, and urban planning. Traditional numerical weather prediction can be computationally intensive at high resolution and short lead times. In this work, we propose a lightweight UNet-based encoder-decoder augmented with axial-attention blocks that attend along image rows and columns to capture long-range spatial interactions, while temporal context is provided by conditioning on multiple past radar frames. Our hybrid architecture captures both local and long-range spatio-temporal dependencies from radar image sequences, enabling fixed lead-time precipitation nowcasting with modest compute. Experimental results on a preprocessed subset of the HKO-7 radar dataset demonstrate that our model outperforms ConvLSTM, pix2pix-style cGANs, and a plain UNet in pixel-fidelity metrics, reaching PSNR 47.67 and SSIM 0.9943. We report PSNR/SSIM here; extending evaluation to meteorology-oriented skill measures (e.g., CSI/FSS) is left to future work. The approach is simple, scalable, and effective for resource-constrained, real-time forecasting scenarios.
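The axial-attention idea above can be sketched in NumPy: self-attention is applied along each row, then along each column, of a feature map, so the cost scales with H·W·(H+W) rather than (H·W)² for full 2D attention. Identity Q/K/V projections and a single head are simplifying assumptions; a real block learns the projections.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def axial_attention(x):
    """Self-attention along rows, then columns, of an (H, W, D) feature map.
    Assumption: identity Q/K/V projections for brevity."""
    H, W, D = x.shape
    # Row attention: each position attends to the W positions in its row.
    scores = np.einsum('hwd,hvd->hwv', x, x) / np.sqrt(D)
    x = np.einsum('hwv,hvd->hwd', softmax(scores), x)
    # Column attention: each position attends to the H positions in its column.
    scores = np.einsum('hwd,gwd->hwg', x, x) / np.sqrt(D)
    x = np.einsum('hwg,gwd->hwd', softmax(scores), x)
    return x
```

Stacking the two one-dimensional attentions gives every output position a full-plane receptive field at far lower cost, which is what makes the block practical at radar-image resolutions.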