Rethinking deep learning: linear regression remains a key benchmark in predicting terrestrial water storage

Nie, Wanshu, Kumar, Sujay V., Chen, Junyu, Zhao, Long, Skulovich, Olya, Yoo, Jinwoong, Pflug, Justin, Ahmad, Shahryar Khalique, Konapala, Goutam

arXiv.org Artificial Intelligence 

Key Points:
We compare linear regression, LSTM, and Transformer models for predicting terrestrial water storage at basin scale over the globe.
Linear regression remains a robust benchmark, outperforming LSTM and Transformer models in various tasks.
Traditional statistical models and global datasets that capture human and natural impacts are essential for deep learning model evaluation.

Abstract

Recent advances in machine learning such as Long Short-Term Memory (LSTM) models and Transformers have been widely adopted in hydrological applications, demonstrating impressive performance amongst deep learning models and outperforming physical models in various tasks. However, their superiority in predicting land surface states such as terrestrial water storage (TWS) that are dominated by many factors such as natural variability and human-driven modifications remains unclear. Here, using the open-access, globally representative HydroGlobe dataset (comprising a baseline version derived solely from a land surface model simulation and an advanced version incorporating multi-source remote sensing data assimilation), we show that linear regression is a robust benchmark, outperforming the more complex LSTM and Temporal Fusion Transformer for TWS prediction. Our findings highlight the importance of including traditional statistical models as benchmarks when developing and evaluating deep learning models. Additionally, we emphasize the critical need to establish globally representative benchmark datasets that capture the combined impact of natural variability and human interventions.

Plain Language Summary

Recent progress in machine learning has led to the widespread use of deep learning models in studying land freshwater systems, but it remains uncertain if they're always the best tools for such applications. In this study, we use a new, global dataset called HydroGlobe to test different data-driven models.
Surprisingly, we find that a basic linear regression model, one of the simplest tools, actually performs better than more complex models like LSTM and Transformers in predicting land water storage. Our results suggest that researchers should always compare deep learning models against simpler traditional statistical benchmarks, and that having high-quality, global datasets that include both natural and human effects is crucial for building better deep learning models.

1 Introduction

Terrestrial water storage (TWS) is a key indicator of the world's freshwater availability, encompassing all forms of water stored on and beneath the land surface, including soil moisture, groundwater, surface water, and snow. As a fundamental component of the global hydrological cycle, accurate TWS estimates are essential for applications related to preserving ecosystems, supporting agriculture, and ensuring water and food security for livelihoods.