PredRNN: Recurrent Neural Networks for Predictive Learning using Spatiotemporal LSTMs

Yunbo Wang, Mingsheng Long, Jianmin Wang, Zhifeng Gao, Philip S. Yu

Neural Information Processing Systems 

The predictive learning of spatiotemporal sequences aims to generate future images by learning from historical frames, where spatial appearances and temporal variations are two crucial structures. This paper models these structures with a predictive recurrent neural network (PredRNN), built on the idea that spatiotemporal predictive learning should memorize both spatial appearances and temporal variations in a unified memory pool. Concretely, memory states are no longer constrained inside each LSTM unit. Instead, they are allowed to zigzag in two directions: across stacked RNN layers vertically and through all RNN states horizontally. The core of this network is a new Spatiotemporal LSTM (ST-LSTM) unit that extracts and memorizes spatial and temporal representations simultaneously.
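To make the zigzag memory flow concrete, here is a toy sketch of the routing the abstract describes. This is not the authors' implementation: the real ST-LSTM uses convolutional gates with learned weights on image tensors, while this sketch uses scalar states, a single hypothetical weight `w`, and hand-written sigmoid/tanh gates purely to illustrate the two memory paths: a per-layer temporal memory `c` that flows horizontally through time, and a shared spatiotemporal memory `m` that flows vertically up the stack at each timestep and then from the top layer back to the bottom layer of the next timestep.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def st_lstm_cell(x, h, c, m, w=0.5):
    # Simplified scalar ST-LSTM-style cell (toy weight w; the paper's
    # version uses learned convolutions on feature maps).
    # First gate set, driven by the temporal memory path (h, c):
    g = math.tanh(w * (x + h))      # candidate update
    i = sigmoid(w * (x + h))        # input gate
    f = sigmoid(w * (x + h))        # forget gate
    c = f * c + i * g               # temporal memory: flows horizontally
    # Second gate set, driven by the spatiotemporal memory path (m):
    gp = math.tanh(w * (x + m))
    ip = sigmoid(w * (x + m))
    fp = sigmoid(w * (x + m))
    m = fp * m + ip * gp            # spatiotemporal memory: zigzags
    # Output gate and hidden state fuse both memories:
    o = sigmoid(w * (x + h + c + m))
    h = o * math.tanh(c + m)
    return h, c, m

def predrnn_forward(frames, num_layers=3):
    # Zigzag routing: m travels up through the layers within each
    # timestep, then carries over from the top layer to the bottom
    # layer of the next timestep; each layer's c stays in its layer.
    h = [0.0] * num_layers
    c = [0.0] * num_layers
    m = 0.0
    outputs = []
    for x in frames:
        inp = x
        for layer in range(num_layers):
            h[layer], c[layer], m = st_lstm_cell(inp, h[layer], c[layer], m)
            inp = h[layer]          # hidden state feeds the layer above
        outputs.append(h[-1])       # top-layer hidden as the prediction
    return outputs
```

Running `predrnn_forward` on a short scalar "video" (e.g. `[0.5, 0.5, 0.5]`) returns one bounded output per frame; the key point is only the routing: `m` is updated by every layer at every step, whereas each `c[layer]` is updated once per step by its own layer.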