GeoMAE: Masking Representation Learning for Spatio-Temporal Graph Forecasting with Missing Values

Ke, Songyu; Wu, Chenyu; Liang, Yuxuan; Qin, Huiling; Zhang, Junbo; Zheng, Yu

arXiv.org Artificial Intelligence 

The ubiquity of missing data in urban intelligence systems, attributable to adverse environmental conditions and equipment failures, poses a significant challenge to downstream applications, notably traffic forecasting and energy consumption prediction. It is therefore imperative to develop a robust spatio-temporal learning methodology capable of extracting meaningful insights from incomplete datasets. Although methods exist for spatio-temporal graph forecasting in the presence of missing values, unresolved issues persist. Primarily, most existing research is predicated on time-series analysis and thereby neglects the dynamic spatial correlations inherent in sensor networks. The GeoMAE model comprises three principal components: an input preprocessing module, an attention-based spatio-temporal forecasting network (STAFN), and an auxiliary learning task, which draws inspiration from Masking AutoEncoders to enhance the robustness of spatio-temporal representation learning. Empirical evaluations on real-world datasets demonstrate that GeoMAE significantly outperforms existing benchmarks, achieving up to 13.20% relative improvement over the best baseline models.

Junbo Zhang is the corresponding author. This research was done when the first author was an intern at JD Intelligent Cities Research & JD iCity under the supervision of the fifth author.

Introduction

Spatio-temporal representation learning has emerged as a pivotal research area, underpinning a range of intelligent smart-city applications across multiple domains. For instance, precise weather forecasting can significantly mitigate the impact of natural disasters through early prevention; advanced traffic prediction systems help optimize traffic flow and substantially reduce congestion; and environmental monitoring enables rapid identification of pollution hotspots within urban environments.