Goto

Collaborating Authors

 predrnn


PredRNN: Recurrent Neural Networks for Predictive Learning using Spatiotemporal LSTMs

Neural Information Processing Systems

The predictive learning of spatiotemporal sequences aims to generate future images by learning from the historical frames, where spatial appearances and temporal variations are two crucial structures.


PredRNN: Recurrent Neural Networks for Predictive Learning using Spatiotemporal LSTMs

Neural Information Processing Systems

The predictive learning of spatiotemporal sequences aims to generate future images by learning from the historical frames, where spatial appearances and temporal variations are two crucial structures.



AI Guide Dog: Egocentric Path Prediction on Smartphone

Jadhav, Aishwarya, Cao, Jeffery, Shetty, Abhishree, Kumar, Urvashi Priyam, Sharma, Aditi, Sukboontip, Ben, Tamarapalli, Jayant Sravan, Zhang, Jingyi, Koul, Anirudh

arXiv.org Artificial Intelligence

This paper introduces AI Guide Dog (AIGD), a lightweight egocentric navigation assistance system for visually impaired individuals, designed for real-time deployment on smartphones. AIGD addresses key challenges in blind navigation by employing a vision-only, multi-label classification approach to predict directional commands, ensuring safe traversal across diverse environments. We propose a novel technique to enable goal-based outdoor navigation by integrating GPS signals and high-level directions, while also addressing uncertain multi-path predictions for destination-free indoor navigation. Our generalized model is the first navigation assistance system to handle both goal-oriented and exploratory navigation scenarios across indoor and outdoor settings, establishing a new state-of-the-art in blind navigation. We present methods, datasets, evaluations, and deployment insights to encourage further innovations in assistive navigation systems.


Fourier Amplitude and Correlation Loss: Beyond Using L2 Loss for Skillful Precipitation Nowcasting

Yan, Chiu-Wai, Foo, Shi Quan, Trinh, Van Hoan, Yeung, Dit-Yan, Wong, Ka-Hing, Wong, Wai-Kin

arXiv.org Artificial Intelligence

Deep learning approaches have been widely adopted for precipitation nowcasting in recent years. Previous studies mainly focus on proposing new model architectures to improve pixel-wise metrics. However, they frequently result in blurry predictions which provide limited utility to forecasting operations. In this work, we propose a new Fourier Amplitude and Correlation Loss (FACL) which consists of two novel loss terms: Fourier Amplitude Loss (FAL) and Fourier Correlation Loss (FCL). FAL regularizes the Fourier amplitude of the model prediction and FCL complements the missing phase information. The two loss terms work together to replace the traditional $L_2$ losses such as MSE and weighted MSE for the spatiotemporal prediction problem on signal-based data. Our method is generic, parameter-free and efficient. Extensive experiments using one synthetic dataset and three radar echo datasets demonstrate that our method improves perceptual metrics and meteorology skill scores, with a small trade-off to pixel-wise accuracy and structural similarity. Moreover, to improve the error margin in meteorological skill scores such as Critical Success Index (CSI) and Fractions Skill Score (FSS), we propose and adopt the Regional Histogram Divergence (RHD), a distance metric that considers the patch-wise similarity between signal-based imagery patterns with tolerance to local transforms. Code is available at https://github.com/argenycw/FACL


Reviews: PredRNN: Recurrent Neural Networks for Predictive Learning using Spatiotemporal LSTMs

Neural Information Processing Systems

Here, the authors focus on the maximum likelihood output, but it would be helpful for comparison with prior work to also report the likelihood. Additionally, the computational complexity is mentioned as an advantage of this model, but no detailed analysis or comparison is performed so its hard to know how this compares computational complexity with prior work. Minor notational suggestion: It might be easier for the reader to follow if you use M instead of C for the cell state in equation 3 so that the connection with equation 4 is clearer.


PredRNN: Recurrent Neural Networks for Predictive Learning using Spatiotemporal LSTMs

Yunbo Wang, Mingsheng Long, Jianmin Wang, Zhifeng Gao, Philip S. Yu

Neural Information Processing Systems

The predictive learning of spatiotemporal sequences aims to generate future images by learning from the historical frames, where spatial appearances and temporal variations are two crucial structures.


Enhancing Spatiotemporal Prediction Model using Modular Design and Beyond

Pan, Haoyu, Wu, Hao, Yang, Tan

arXiv.org Artificial Intelligence

Predictive learning uses a known state to generate a future state over a period of time. It is a challenging task to predict spatiotemporal sequence because the spatiotemporal sequence varies both in time and space. The mainstream method is to model spatial and temporal structures at the same time using RNN-based or transformer-based architecture, and then generates future data by using learned experience in the way of auto-regressive. The method of learning spatial and temporal features simultaneously brings a lot of parameters to the model, which makes the model difficult to be convergent. In this paper, a modular design is proposed, which decomposes spatiotemporal sequence model into two modules: a spatial encoder-decoder and a predictor. These two modules can extract spatial features and predict future data respectively. The spatial encoder-decoder maps the data into a latent embedding space and generates data from the latent space while the predictor forecasts future embedding from past. By applying the design to the current research and performing experiments on KTH-Action and MovingMNIST datasets, we both improve computational performance and obtain state-of-the-art results.


Comparing recurrent and convolutional neural networks for predicting wave propagation

Fotiadis, Stathi, Pignatelli, Eduardo, Valencia, Mario Lino, Cantwell, Chris, Storkey, Amos, Bharath, Anil A.

arXiv.org Machine Learning

Dynamical systems can be modelled by partial differential equations and numerical computations are used everywhere in science and engineering. In this work, we investigate the performance of recurrent and convolutional deep neural network architectures to predict the surface waves. The system is governed by the Saint-Venant equations. We improve on the long-term prediction over previous methods while keeping the inference time at a fraction of numerical simulations. We also show that convolutional networks perform at least as well as recurrent networks in this task. Finally, we assess the generalisation capability of each network by extrapolating in longer time-frames and in different physical settings.


PredRNN: Recurrent Neural Networks for Predictive Learning using Spatiotemporal LSTMs

Wang, Yunbo, Long, Mingsheng, Wang, Jianmin, Gao, Zhifeng, Yu, Philip S.

Neural Information Processing Systems

The predictive learning of spatiotemporal sequences aims to generate future images by learning from the historical frames, where spatial appearances and temporal variations are two crucial structures. This architecture is enlightened by the idea that spatiotemporal predictive learning should memorize both spatial appearances and temporal variations in a unified memory pool. Concretely, memory states are no longer constrained inside each LSTM unit. Instead, they are allowed to zigzag in two directions: across stacked RNN layers vertically and through all RNN states horizontally. The core of this network is a new Spatiotemporal LSTM (ST-LSTM) unit that extracts and memorizes spatial and temporal representations simultaneously.