FourCastNeXt: Improving FourCastNet Training with Limited Compute
Edison Guo, Maruf Ahmed, Yue Sun, Rahul Mahendru, Rui Yang, Harrison Cook, Tennessee Leeuwenburg, Ben Evans
arXiv.org Artificial Intelligence
Recently, the FourCastNet Neural Earth System Model (NESM) (Pathak et al., 2022) has shown impressive results in predicting various atmospheric variables, trained on the ERA5 reanalysis dataset. While FourCastNet enjoys quasi-linear time and memory complexity in sequence length, compared to the quadratic complexity of vanilla transformers, training FourCastNet on ERA5 from scratch still requires a large amount of compute, which is expensive or even inaccessible to most researchers. In this work, we show improved methods that can train FourCastNet using only 1% of the compute required by the baseline, while maintaining model performance on par with or even better than the baseline. We provide technical details of our methodologies along with experimental results and an ablation study of the different components of our methods. We call our improved model FourCastNeXt, in a similar spirit to ConvNeXt (Liu et al., 2022).
Jan-10-2024