CirT: Global Subseasonal-to-Seasonal Forecasting with Geometry-inspired Transformer

Liu, Yang; Zheng, Zinan; Cheng, Jiashun; Tsung, Fugee; Zhao, Deli; Rong, Yu; Li, Jia

arXiv.org Artificial Intelligence 

Accurate subseasonal-to-seasonal (S2S) climate forecasting is pivotal for decision-making in areas such as agricultural planning and disaster preparedness, but it is known to be challenging because of the chaotic nature of the atmosphere. Although recent data-driven models have shown promising results, their performance is limited by inadequate consideration of geometric inductive biases. They typically treat spherical weather data as planar images, which yields an inaccurate representation of locations and spatial relations. In this work, we propose the geometry-inspired Circular Transformer (CirT) to model the cyclic characteristic of the graticule. It consists of two key designs: (1) decomposing the weather data by latitude into circular patches that serve as input tokens to the Transformer; (2) leveraging the Fourier transform in self-attention to capture global information and model spatial periodicity. Extensive experiments on the Earth Reanalysis 5 (ERA5) dataset demonstrate that our model yields a significant improvement over advanced data-driven models, including PanguWeather and GraphCast, as well as skillful ECMWF systems. Additionally, we empirically show the effectiveness of our model designs and the high quality of its predictions across spatial and temporal dimensions. The code is available at https://github.com/compasszzn/CirT.

Subseasonal-to-seasonal (S2S) forecasting, which predicts meteorological variables 2 to 6 weeks in advance, is crucial for agriculture, resource allocation, and preparedness for disasters such as heatwaves and droughts (Mouatadid et al., 2024). Despite its high socioeconomic value, this task has long been considered a "predictability desert" (Vitart et al., 2012) due to the chaotic nature of the atmosphere.
Compared with medium-range predictions (up to 15 days) and seasonal predictions (3-6 months) (Vitart et al., 2017), the S2S timescale is long enough that much of the memory of the atmospheric initial conditions is lost, yet too short for slowly evolving Earth system components such as the ocean to strongly influence the atmosphere (Black et al., 2017; Phakula et al., 2024).
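
To illustrate the two designs summarized in the abstract (latitude-wise circular patches and a Fourier transform along the periodic longitude axis), the following is a minimal NumPy sketch. It is not the authors' implementation, which is available in the linked repository. The function names `latitude_ring_tokens` and `ring_spectrum`, the `(channels, lat, lon)` layout, and the toy grid size are assumptions made for this example only. It shows only the tokenization and the spectral view of each token, not the attention mechanism itself.

```python
import numpy as np


def latitude_ring_tokens(field):
    """Split a (channels, lat, lon) field into one token per latitude ring.

    Returns an array of shape (lat, channels, lon): token i holds every
    channel's values around the full circle of latitude i.
    """
    if field.ndim != 3:
        raise ValueError("expected an array of shape (channels, lat, lon)")
    return np.transpose(field, (1, 0, 2))


def ring_spectrum(tokens):
    """Real FFT along longitude, the periodic axis of each ring token."""
    return np.fft.rfft(tokens, axis=-1)


# Hypothetical toy grid: 2 variables, 4 latitudes, 8 longitudes.
rng = np.random.default_rng(0)
field = rng.standard_normal((2, 4, 8))

tokens = latitude_ring_tokens(field)  # shape (4, 2, 8)
spec = ring_spectrum(tokens)          # shape (4, 2, 5): 8 // 2 + 1 frequencies

# Rotating the globe in longitude is a circular shift of every ring. It
# changes only the phase of each Fourier coefficient, not its magnitude.
shifted_field = np.roll(field, 3, axis=-1)
shifted_spec = ring_spectrum(latitude_ring_tokens(shifted_field))
```

Because each token is a closed ring, the first and last longitude are neighbours. A planar image patch would place them at opposite edges, whereas the discrete Fourier basis treats the ring as periodic by construction, which is why the coefficient magnitudes of `spec` and `shifted_spec` agree.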