Methods to improve run time of hydrologic models: opportunities and challenges in the machine learning era
The application of Machine Learning (ML) to hydrologic modeling is still in its infancy. Its ability to capture dependencies within watersheds and produce skillful forecasts quickly is compelling. One of the key reasons to adopt ML algorithms over physics-based models is their computational efficiency and flexibility in working with various data sets. Diverse applications, particularly emergency response and large-scale modeling, demand hydrological predictions on short timescales and lead researchers to adopt data-driven modeling approaches. In this work, we examine how ML and deep learning (DL) can help improve the overall run time of physics-based models, and the potential constraints that should be addressed while modeling. This paper covers the opportunities and challenges of adopting ML for hydrological modeling, how ML can improve the simulation time of physics-based models, and directions for future work.
Differentiable, learnable, regionalized process-based models with physical outputs can approach state-of-the-art hydrologic prediction accuracy
Feng, Dapeng, Liu, Jiangtao, Lawson, Kathryn, Shen, Chaopeng
Predictions of hydrologic variables across the entire water cycle have significant value for water resource management as well as downstream applications such as ecosystem and water quality modeling. Recently, purely data-driven deep learning models like long short-term memory (LSTM) showed seemingly insurmountable performance in modeling rainfall-runoff and other geoscientific variables, yet they cannot predict untrained physical variables and remain challenging to interpret. Here we show that differentiable, learnable, process-based models (called δ models here) can approach the performance level of LSTM for the intensively-observed variable (streamflow) with regionalized parameterization. We use a simple hydrologic model HBV as the backbone and use embedded neural networks, which can only be trained in a differentiable programming framework, to parameterize, enhance, or replace the process-based model modules. Without using an ensemble or post-processor, δ models can obtain a median Nash-Sutcliffe efficiency of 0.732 for 671 basins across the USA for the Daymet forcing dataset, compared to 0.748 from a state-of-the-art LSTM model with the same setup. For another forcing dataset, the difference is even smaller: 0.715 vs. 0.722. Meanwhile, the resulting learnable process-based models can output a full set of untrained variables, e.g., soil and groundwater storage, snowpack, evapotranspiration, and baseflow, and later be constrained by their observations. Both simulated evapotranspiration and fraction of discharge from baseflow agreed decently with alternative estimates. The general framework can work with models of various process complexity and opens up the path for learning physics from big data.
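The embedded-network parameterization described above can be illustrated with a toy sketch: a small neural network maps basin attributes to the parameters of a simple bucket model, so the whole pipeline is differentiable end to end. The NumPy sketch below (forward pass only, no training) is an assumption-laden illustration — the single-bucket model and all names are hypothetical stand-ins, not the authors' HBV implementation.

```python
import numpy as np

def mlp_params(attrs, W1, b1, W2, b2):
    """Tiny MLP: basin attributes -> bucket-model parameters in (0, 1)."""
    h = np.tanh(attrs @ W1 + b1)
    return 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))  # sigmoid keeps params bounded

def bucket_model(precip, pet, k, fc):
    """One-bucket rainfall-runoff model: storage S, outflow grows with S."""
    S, Q = 0.0, []
    for p, e in zip(precip, pet):
        S = max(S + p - e, 0.0)          # add rain, remove evapotranspiration
        q = k * min(S / fc, 1.0) * S     # runoff scales with relative storage
        S -= q
        Q.append(q)
    return np.array(Q)

rng = np.random.default_rng(0)
attrs = rng.normal(size=4)                      # hypothetical basin attributes
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)   # untrained MLP weights
W2, b2 = rng.normal(size=(8, 2)), np.zeros(2)
k, fc_frac = mlp_params(attrs, W1, b1, W2, b2)  # NN-produced parameters
fc = 50.0 + 450.0 * fc_frac                     # scale capacity into [50, 500] mm
precip = rng.gamma(2.0, 3.0, size=30)           # synthetic daily forcing (mm)
pet = np.full(30, 1.5)
Q = bucket_model(precip, pet, k, fc)            # simulated daily runoff
```

In the actual δ framework, the process model is HBV and automatic differentiation carries the gradient of the streamflow error back through the model into the embedding network's weights; this sketch only shows the forward composition.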
Machine Learning for Postprocessing Ensemble Streamflow Forecasts
Sharma, Sanjib, Ghimire, Ganesh Raj, Siddique, Ridwan
Skillful streamflow forecasts can inform decisions in various areas of water policy and management. We integrate numerical weather prediction ensembles, a distributed hydrological model, and machine learning to generate ensemble streamflow forecasts at medium-range lead times (1-7 days). We demonstrate a case study for machine learning applications in postprocessing ensemble streamflow forecasts in the Upper Susquehanna River basin in the eastern United States. Our results show that the machine learning postprocessor can improve streamflow forecasts relative to low-complexity forecasts (e.g., climatological and temporal persistence) as well as standalone hydrometeorological modeling and neural networks. The relative gain in forecast skill from the postprocessor is generally higher at medium-range timescales compared to shorter lead times; for high flows compared to low-to-moderate flows; and in warm seasons compared to cool ones. Overall, our results highlight the benefits of machine learning for improving both the skill and reliability of streamflow forecasts.
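A statistical postprocessor of the general kind described above can be sketched as a regression from ensemble summary statistics to the observed flow. The NumPy example below fits a simple least-squares correction using the ensemble mean and spread as features — a generic illustration, not the paper's specific ML method; all data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
obs = rng.gamma(3.0, 10.0, size=n)                    # synthetic observed flows
ens = obs[:, None] * 1.2 + rng.normal(0, 5, (n, 20))  # biased, noisy 20-member ensemble

# Features: intercept, ensemble mean, ensemble spread; target: observation.
X = np.column_stack([np.ones(n), ens.mean(axis=1), ens.std(axis=1)])
coef, *_ = np.linalg.lstsq(X, obs, rcond=None)        # least-squares fit

corrected = X @ coef
raw_rmse = np.sqrt(np.mean((ens.mean(axis=1) - obs) ** 2))  # raw ensemble mean error
pp_rmse = np.sqrt(np.mean((corrected - obs) ** 2))          # postprocessed error
```

Because the synthetic ensemble carries a multiplicative bias, the fitted correction reduces the in-sample RMSE relative to the raw ensemble mean; a real postprocessor would be trained and verified on separate periods and could use a richer model than linear regression.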
Enhancing streamflow forecast and extracting insights using long-short term memory networks with data integration at continental scales
Feng, Dapeng, Fang, Kuai, Shen, Chaopeng
Recent observations with varied schedules and types (moving average, snapshot, or regularly spaced) can help to improve streamflow forecasts, but it is difficult to integrate them effectively. Based on a long short-term memory (LSTM) streamflow model, we tested different formulations in a flexible method we call data integration (DI) to integrate recent discharge measurements to improve forecasts. DI accepts lagged inputs either directly or through a convolutional neural network (CNN) unit. DI can ubiquitously elevate streamflow forecast performance to unseen levels, reaching a continental-scale median Nash-Sutcliffe coefficient of 0.86. Integrating moving-average discharge, discharge from a few days ago, or even average discharge of the last calendar month could all improve daily forecasts. It turned out that directly using lagged observations as inputs was comparable in performance to using the CNN unit. Importantly, we obtained valuable insights regarding hydrologic processes impacting LSTM and DI performance. Before applying DI, the original LSTM worked well in mountainous regions and snow-dominated regions, but less so in regions with low discharge volumes (due to either low precipitation or high precipitation-energy synchronicity) and large inter-annual storage variability. DI was most beneficial in regions with high flow autocorrelation: it greatly reduced baseflow bias in groundwater-dominated western basins; it also improved the peaks for basins with dynamical surface water storage, e.g., the Prairie Potholes or Great Lakes regions. However, even DI cannot help high-aridity basins with one-day flash peaks. There is much promise in a deep-learning-based forecast paradigm due to its performance, automation, efficiency, and flexibility.
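The direct-input variant of data integration described above amounts to appending lagged observed discharge as an extra input feature alongside the meteorological forcings. A minimal sketch of that feature construction (function and variable names are illustrative, not from the paper's code):

```python
import numpy as np

def add_lagged_discharge(forcings, discharge, lag):
    """Append discharge observed `lag` days earlier as an extra input feature.

    forcings:  (T, F) array of meteorological inputs
    discharge: (T,)  array of observed discharge
    Returns (T - lag, F + 1) inputs aligned with targets discharge[lag:].
    """
    lagged = discharge[:-lag]                       # value from `lag` days ago
    X = np.column_stack([forcings[lag:], lagged])   # forcings + lagged obs
    y = discharge[lag:]                             # same-day target
    return X, y

T, F = 100, 3
rng = np.random.default_rng(2)
forcings = rng.normal(size=(T, F))                  # synthetic forcings
discharge = rng.gamma(2.0, 5.0, size=T)             # synthetic discharge record
X, y = add_lagged_discharge(forcings, discharge, lag=1)
```

The resulting `X` would feed an LSTM (or any sequence model) exactly like the original forcings; the CNN-unit variant would instead pass a window of lagged observations through a small convolutional encoder before concatenation.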
Accurate Hydrologic Modeling Using Less Information
Shalev, Guy, El-Yaniv, Ran, Klotz, Daniel, Kratzert, Frederik, Metzger, Asher, Nevo, Sella
Joint models are a common and important tool at the intersection of machine learning and the physical sciences, particularly in contexts where real-world measurements are scarce. Recent developments in rainfall-runoff modeling, one of the prime challenges in hydrology, show the value of a joint model with shared representation in this important context. However, current state-of-the-art models depend on detailed and reliable attributes characterizing each site to help the model differentiate correctly between the behavior of different sites. This dependency can present a challenge in data-poor regions. In this paper, we show that we can replace the need for such location-specific attributes with a completely data-driven learned embedding, and match previous state-of-the-art results with less information.
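The learned-embedding idea can be sketched as replacing a site's static attribute vector with a trainable per-site vector looked up by site ID and concatenated to each timestep's inputs. The NumPy sketch below is hypothetical (class and method names are assumptions); in practice the embedding table would be trained jointly with the forecasting model rather than left at its random initialization.

```python
import numpy as np

class SiteEmbedding:
    """Trainable per-site vectors standing in for static catchment attributes."""

    def __init__(self, site_ids, dim, seed=0):
        rng = np.random.default_rng(seed)
        # One learnable vector per gauge site, looked up by ID.
        self.table = {s: rng.normal(scale=0.1, size=dim) for s in site_ids}

    def inputs_for(self, site_id, forcings):
        """Concatenate the site's embedding onto every forcing timestep."""
        emb = self.table[site_id]
        return np.hstack([forcings, np.tile(emb, (len(forcings), 1))])

sites = ["site_a", "site_b"]                                # hypothetical gauge IDs
emb = SiteEmbedding(sites, dim=5)
forcings = np.random.default_rng(3).normal(size=(10, 4))    # (T, F) forcings
X = emb.inputs_for("site_a", forcings)                      # (10, 4 + 5) inputs
```

Because the embedding is learned from the discharge record alone, the model no longer needs curated attribute databases, which is precisely what makes the approach attractive in data-poor regions.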
ML for Flood Forecasting at Scale
Nevo, Sella, Anisimov, Vova, Elidan, Gal, El-Yaniv, Ran, Giencke, Pete, Gigi, Yotam, Hassidim, Avinatan, Moshe, Zach, Schlesinger, Mor, Shalev, Guy, Tirumali, Ajai, Wiesel, Ami, Zlydenko, Oleg, Matias, Yossi
Effective riverine flood forecasting at scale is hindered by a multitude of factors, most notably the need to rely on human calibration in current methodology, the limited amount of data for a specific location, and the computational difficulty of building continent/global level models that are sufficiently accurate. Machine learning (ML) is primed to be useful in this scenario: learned models often surpass human experts in complex high-dimensional scenarios, and the framework of transfer or multitask learning is an appealing solution for leveraging local signals to achieve improved global performance. We propose to build on these strengths and develop ML systems for timely and accurate riverine flood prediction. Floods are the most common and deadly natural disaster in the world. Every year, floods cause from thousands to tens of thousands of fatalities [1, 22, 2, 21, 14], affect hundreds of millions of people [14, 21, 2], and cause tens of billions of dollars worth of damages [1, 2]. These numbers have only been increasing in recent decades [23]. Indeed, the UN charter notes floods to be one of the key motivators for formulating the sustainable development goals (SDGs), and directly challenges us: "They knew that earthquakes and floods were inevitable, but that the high death tolls were not."