McGovern, Amy
Machine Learning Estimation of Maximum Vertical Velocity from Radar
Chase, Randy J., McGovern, Amy, Homeyer, Cameron, Marinescu, Peter, Potvin, Corey
The quantification of storm updrafts remains unavailable for operational forecasting despite their inherent importance to convection and its associated severe weather hazards. Updraft proxies, like overshooting top area from satellite images, have been linked to severe weather hazards but relate to only a limited portion of the total storm updraft. This study investigates whether a machine learning model, namely a U-Net, can skillfully retrieve maximum vertical velocity and its areal extent from 3-dimensional gridded radar reflectivity alone. The machine learning model is trained using simulated radar reflectivity and vertical velocity from the National Severe Storms Laboratory's convection-permitting Warn-on-Forecast System (WoFS). A parametric regression technique using the sinh-arcsinh-normal distribution is adapted to run with U-Nets, allowing for both deterministic and probabilistic predictions of maximum vertical velocity. The best models after hyperparameter search achieved a root-mean-squared error of less than 50%, a coefficient of determination greater than 0.65, and an intersection over union (IoU) of more than 0.45 on an independent test set composed of WoFS data. Beyond the WoFS analysis, a case study was conducted using real radar data and corresponding dual-Doppler analyses of vertical velocity within a supercell. The U-Net consistently underestimated the dual-Doppler updraft speed estimates by 50%, while the areas of the 5 and 10 m s^-1 updraft cores showed an IoU of 0.25. Although the above statistics are not exceptional, the machine learning model enables quick distillation of 3D radar data related to the maximum vertical velocity, which could be useful in assessing a storm's severe potential.
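The parametric regression described above can be illustrated with a short sketch. Below is a minimal, hypothetical Python implementation of the sinh-arcsinh-normal (SHASH) negative log-likelihood, assuming the Jones and Pewsey (2009) parameterization; the paper's own network head and parameter conventions may differ.

```python
# Illustrative sketch (not the authors' code): negative log-likelihood for
# sinh-arcsinh-normal (SHASH) parametric regression. A U-Net head would
# output the four parameter maps (mu, sigma, skewness eps, tailweight delta).
import numpy as np

def shash_nll(y, mu, sigma, eps, delta):
    """Mean negative log-likelihood of y under a SHASH distribution.

    All arguments are arrays of the same shape; sigma and delta must be
    positive (in practice enforced with a softplus or exp activation).
    """
    s = (y - mu) / sigma                 # standardized residual
    t = delta * np.arcsinh(s) - eps      # transformed coordinate
    z = np.sinh(t)                       # maps y back to a standard normal
    log_jacobian = (np.log(delta) + np.log(np.cosh(t))
                    - 0.5 * np.log1p(s**2) - np.log(sigma))
    log_pdf = -0.5 * z**2 - 0.5 * np.log(2.0 * np.pi) + log_jacobian
    return -np.mean(log_pdf)
```

Training would minimize this quantity; under this parameterization a deterministic prediction can be read off as the distribution median, mu + sigma * sinh(eps / delta), while the full parameter set supplies the probabilistic prediction.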
A Machine Learning Tutorial for Operational Meteorology, Part II: Neural Networks and Deep Learning
Chase, Randy J., Harrison, David R., Lackmann, Gary, McGovern, Amy
Over the past decade, the use of machine learning in meteorology has grown rapidly. Specifically, neural networks and deep learning have been used at an unprecedented rate. To help fill the dearth of resources covering neural networks through a meteorological lens, this paper discusses machine learning methods in a plain-language format targeted at the operational meteorological community. This is the second paper in a pair that aims to serve as a machine learning resource for meteorologists. While the first paper focused on traditional machine learning methods (e.g., random forests), here a broad spectrum of neural networks and deep learning methods is discussed. Specifically, this paper covers perceptrons, artificial neural networks, convolutional neural networks, and U-networks. Like the Part I paper, this manuscript discusses the terms associated with neural networks and their training. The manuscript then provides some intuition behind each method and concludes by showing each method used in a meteorological example of diagnosing thunderstorms from satellite images (e.g., lightning flashes). This paper is accompanied by an open-source code repository to allow readers to explore neural networks using either the dataset provided (which is used in the paper) or as a template for alternate datasets.
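As a flavor of the networks the tutorial covers, here is a minimal, hypothetical Keras sketch of a convolutional classifier for the kind of satellite-based thunderstorm diagnosis the paper uses as its example. The patch size, channel count, and layer widths are illustrative assumptions, not the tutorial's actual configuration.

```python
# Hypothetical sketch: a small CNN that classifies 32x32 single-channel
# satellite patches as thunderstorm / no thunderstorm.
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    layers.Input(shape=(32, 32, 1)),        # e.g., one infrared channel
    layers.Conv2D(16, 3, activation="relu", padding="same"),
    layers.MaxPooling2D(),                  # downsample 32 -> 16
    layers.Conv2D(32, 3, activation="relu", padding="same"),
    layers.MaxPooling2D(),                  # 16 -> 8
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # probability of lightning
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
```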
Comparing Explanation Methods for Traditional Machine Learning Models Part 2: Quantifying Model Explainability Faithfulness and Improvements with Dimensionality Reduction
Flora, Montgomery, Potvin, Corey, McGovern, Amy, Handler, Shawn
Machine learning (ML) models are becoming increasingly common in the atmospheric science community, with a wide range of applications. To enable users to understand what an ML model has learned, ML explainability has become a field of active research. In Part I of this two-part study, we described several explainability methods and demonstrated that feature rankings from different methods can substantially disagree with each other. It is unclear, though, whether the disagreement is overinflated because some methods are less faithful in assigning importance. Herein, "faithfulness" or "fidelity" refers to the correspondence between the assigned feature importance and the contribution of the feature to model performance. In the present study, we evaluate the faithfulness of the feature ranking methods using multiple approaches. Given the sensitivity of explanation methods to feature correlations, we also quantify how much explainability faithfulness improves after correlated features are limited. Before dimensionality reduction, the feature relevance methods [e.g., SHAP, LIME, ALE variance, and logistic regression (LR) coefficients] were generally more faithful than the permutation importance methods due to the negative impact of correlated features. Once correlated features were reduced, traditional permutation importance became the most faithful method. In addition, the ranking uncertainty (i.e., the spread in ranks assigned to a feature by the different ranking methods) was reduced by a factor of 2-10, and excluding the less faithful feature ranking methods reduced it further. This study is one of the first to quantify the improvement in explainability gained by limiting correlated features and to establish the relative fidelity of different explainability methods.
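To make the workflow concrete, the following is a minimal sketch, using scikit-learn rather than the study's own codebase, of the two steps the abstract describes: limiting correlated features, then computing traditional permutation importance. The correlation threshold and synthetic data are placeholder assumptions.

```python
# Illustrative sketch: drop one feature of each highly correlated pair,
# then compute permutation importance on the reduced feature set.
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

def drop_correlated(X: pd.DataFrame, threshold: float = 0.9) -> pd.DataFrame:
    """Remove the second feature of every pair with |Pearson r| > threshold."""
    corr = X.corr().abs()
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return X.drop(columns=to_drop)

# Synthetic placeholder data standing in for a real predictor set.
Xa, y = make_classification(n_samples=500, n_features=10, random_state=0)
X = pd.DataFrame(Xa, columns=[f"f{i}" for i in range(10)])

X_reduced = drop_correlated(X, threshold=0.9)
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_reduced, y)
result = permutation_importance(model, X_reduced, y, n_repeats=10,
                                scoring="roc_auc", random_state=0)
ranking = X_reduced.columns[np.argsort(result.importances_mean)[::-1]]
print(list(ranking))  # features ordered from most to least important
```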
Global Extreme Heat Forecasting Using Neural Weather Models
Lopez-Gomez, Ignacio, McGovern, Amy, Agrawal, Shreya, Hickey, Jason
Heat waves are projected to increase in frequency and severity with global warming. Improved warning systems would help reduce the associated loss of lives, wildfires, power disruptions, and reduction in crop yields. In this work, we explore the potential for deep learning systems trained on historical data to forecast extreme heat on short, medium, and subseasonal timescales. To this end, we train a set of neural weather models (NWMs) with convolutional architectures to forecast surface temperature anomalies globally, 1 to 28 days ahead, at ~200 km resolution and on the cubed sphere. The NWMs are trained using the ERA5 reanalysis product and a set of candidate loss functions, including the mean squared error and exponential losses targeting extremes. We find that training models to minimize custom losses tailored to emphasize extremes leads to significant skill improvements in the heat wave prediction task, compared to NWMs trained on the mean squared error loss. This improvement is accomplished with almost no skill reduction in the general temperature prediction task, and it can be efficiently realized through transfer learning, by re-training NWMs with the custom losses for a few epochs. In addition, we find that the use of a symmetric exponential loss reduces the smoothing of NWM forecasts with lead time. Our best NWM is able to outperform persistence in a regressive sense for all lead times and temperature anomaly thresholds considered, and shows positive regressive skill compared to the ECMWF subseasonal-to-seasonal control forecast after two weeks.
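One plausible form of such an extreme-targeting loss is sketched below as a Keras custom loss; the exact functional form and the scale parameter tau are assumptions for illustration, not necessarily the exponential losses used in the paper.

```python
# Hedged sketch of one possible exponentially weighted loss: the squared
# error is weighted by exp(|anomaly| / tau), so large temperature anomalies
# dominate the gradient, emphasizing extremes as the abstract describes.
import tensorflow as tf

def exp_weighted_mse(tau: float = 2.0):
    def loss(y_true, y_pred):
        weights = tf.exp(tf.abs(y_true) / tau)  # heavier weight on extremes
        return tf.reduce_mean(weights * tf.square(y_true - y_pred))
    return loss

# Transfer learning as described in the abstract: re-train a pretrained NWM
# for a few epochs under the custom loss (model and train_ds are placeholders).
# model.compile(optimizer="adam", loss=exp_weighted_mse(tau=2.0))
# model.fit(train_ds, epochs=3)
```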
Comparing Explanation Methods for Traditional Machine Learning Models Part 1: An Overview of Current Methods and Quantifying Their Disagreement
Flora, Montgomery, Potvin, Corey, McGovern, Amy, Handler, Shawn
We demonstrate and visualize different explanation methods, show how to interpret them, and provide a complete Python package (scikit-explain) to allow future researchers to explore these products. We also highlight the frequent disagreement between explanation methods for feature rankings and feature effects and provide practical advice for dealing with these disagreements. We used ML models developed for severe weather prediction and sub-freezing road surface temperature prediction to generalize the behavior of the different explanation methods. For feature rankings, there is substantially more agreement on the set of top features (e.g., on average, two methods agree on 6 of the top 10 features) than on specific rankings (on average, two methods agree on the ranks of only 2-3 features in the set of top 10 features). On the other hand, two feature effect curves from different methods are in high agreement as long as the phase space is well sampled. Finally, a lesser-known method, tree interpreter, was found to be comparable to SHAP for feature effects; given the widespread use of random forests in the geosciences and the computational ease of tree interpreter, we recommend it be explored in future research.
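The two agreement notions in the abstract (agreement on the set of top-10 features versus agreement on their exact ranks) can be computed in a few lines of Python; the feature names below are invented for illustration.

```python
# Minimal sketch of the two agreement measures described in the abstract.
def top_k_set_agreement(rank_a, rank_b, k=10):
    """Count of features shared between the two methods' top-K sets."""
    return len(set(rank_a[:k]) & set(rank_b[:k]))

def top_k_rank_agreement(rank_a, rank_b, k=10):
    """Count of features placed at the same position in both top-K lists."""
    n = min(k, len(rank_a), len(rank_b))
    return sum(1 for i in range(n) if rank_a[i] == rank_b[i])

# Hypothetical rankings from two explanation methods (e.g., SHAP vs. LIME).
shap_rank = ["cape", "shear", "lcl", "srh", "t2m", "td2m", "u10", "v10", "pw", "cin"]
lime_rank = ["cape", "srh", "cin", "pw", "shear", "lapse", "rh700", "lcl", "theta_e", "omega"]
print(top_k_set_agreement(shap_rank, lime_rank))   # 6 shared features
print(top_k_rank_agreement(shap_rank, lime_rank))  # 1 identical position
```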
A Machine Learning Tutorial for Operational Meteorology, Part I: Traditional Machine Learning
Chase, Randy J., Harrison, David R., Burke, Amanda, Lackmann, Gary M., McGovern, Amy
Recently, the use of machine learning in meteorology has increased greatly. While many machine learning methods are not new, university classes on machine learning are largely unavailable to meteorology students and are not required to become a meteorologist. This lack of formal instruction has contributed to the perception that machine learning methods are 'black boxes', making end-users hesitant to apply them in their everyday workflows. To reduce the opaqueness of machine learning methods and lower hesitancy towards machine learning in meteorology, this paper provides a survey of some of the most common machine learning methods. A familiar meteorological example is used to contextualize the machine learning methods while machine learning topics are discussed in plain language. The following machine learning methods are demonstrated: linear regression; logistic regression; decision trees; random forests; gradient boosted decision trees; naive Bayes; and support vector machines. Beyond discussing the different methods, the paper also covers the general machine learning process as well as best practices to enable readers to apply machine learning to their own datasets. Furthermore, all code (in the form of Jupyter notebooks and Google Colaboratory notebooks) used to make the examples in the paper is provided in an effort to catalyse the use of machine learning in meteorology.
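In the spirit of the accompanying notebooks (though not taken from them), the sketch below compares several of the surveyed methods with cross-validation on a synthetic dataset using scikit-learn; the dataset and hyperparameters are placeholder choices.

```python
# Illustrative comparison of several traditional ML methods from the survey.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# Synthetic stand-in for a meteorological dataset.
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(max_depth=5),
    "random forest": RandomForestClassifier(n_estimators=100),
    "gradient boosting": GradientBoostingClassifier(),
    "naive Bayes": GaussianNB(),
    "SVM": SVC(),
}
for name, clf in models.items():
    scores = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
    print(f"{name}: AUC = {scores.mean():.3f} +/- {scores.std():.3f}")
```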
The Need for Ethical, Responsible, and Trustworthy Artificial Intelligence for Environmental Sciences
McGovern, Amy, Ebert-Uphoff, Imme, Gagne, David John II, Bostrom, Ann
Given the growing use of Artificial Intelligence (AI) and machine learning (ML) methods across all aspects of the environmental sciences, it is imperative that we initiate a discussion about the ethical and responsible use of AI. In fact, much can be learned from other domains where AI was introduced, often with the best of intentions, yet frequently led to unintended societal consequences, such as hard-coding racial bias into the criminal justice system or increasing economic inequality through the financial system. A common misconception is that the environmental sciences are immune to such unintended consequences when AI is being used, as most data come from observations and AI algorithms are based on mathematical formulas, which are often seen as objective. In this article, we argue the opposite can be the case. Using specific examples, we demonstrate many ways in which the use of AI can introduce similar consequences in the environmental sciences. This article is intended to stimulate discussion and research efforts in this direction. As a community, we should avoid repeating any foreseeable mistakes made in other domains through the introduction of AI. In fact, with proper precautions, AI can be a great tool to help reduce climate and environmental injustice. We primarily focus on weather and climate examples, but the conclusions apply broadly across the environmental sciences.
A Summary of the Twenty-Ninth AAAI Conference on Artificial Intelligence
Morris, Robert (NASA) | Bonet, Blai (Universidad Simón Bolívar) | Cavazza, Marc (Teesside University) | desJardins, Marie (University of Maryland, Baltimore County) | Felner, Ariel (Ben-Gurion University) | Hawes, Nick (University of Birmingham) | Knox, Brad (Massachusetts Institute of Technology) | Koenig, Sven (University of Southern California) | Konidaris, George (Massachusetts Institute of Technology) | Lang, Jérôme (Université Paris-Dauphine) | López, Carlos Linares (Universidad Carlos III de Madrid) | Magazzeni, Daniele (King's College London) | McGovern, Amy (University of Oklahoma) | Natarajan, Sriraam (Indiana University) | Sturtevant, Nathan R. (University of Denver) | Thielscher, Michael (University of New South Wales) | Yeoh, William (New Mexico State University) | Sardina, Sebastian (RMIT University) | Wagstaff, Kiri (Jet Propulsion Laboratory)
The Twenty-Ninth AAAI Conference on Artificial Intelligence (AAAI-15) was held in January 2015 in Austin, Texas (USA). The conference program was cochaired by Sven Koenig and Blai Bonet. This report contains reflective summaries of the main conference, the robotics program, the AI and robotics workshop, the virtual agent exhibition, the what's hot track, the competition panel, the senior member track, student and outreach activities, the student abstract and poster program, the doctoral consortium, the women's mentoring event, and the demonstrations program.
The AAAI-15 organizing committee of about 60 researchers arranged many of the traditional AAAI events, including the Innovative Applications of Artificial Intelligence (IAAI) Conference, tutorials, workshops, the video competition, senior member summary talks (on well-developed bodies of research or important new research areas), and What's Hot talks (on research trends observed in other AI-related conferences and, for the first time, competitions). Innovations of AAAI-15 included software and hardware demonstration programs, a virtual agent exhibition, a computer-game showcase, a funding information session with program directors from different funding agencies, and Blue Sky Idea talks (on visions intended to stimulate new directions in AI research) with awards funded by the CRA Computing Community Consortium. Seven invited talks surveyed AI research in academia and industry and its impact on society. Attendees kept track of the program through a smartphone app as well as social media channels.
Day-Ahead Hail Prediction Integrating Machine Learning with Storm-Scale Numerical Weather Models
Gagne, David John II (University of Oklahoma) | McGovern, Amy (University of Oklahoma) | Brotzge, Jerald (University at Albany) | Coniglio, Michael (NOAA National Severe Storms Laboratory) | Correia, James Jr. (NOAA Storm Prediction Center, NOAA/OU Cooperative Institute for Mesoscale Meteorological Studies) | Xue, Ming (University of Oklahoma)
Hail causes billions of dollars in losses by damaging buildings, vehicles, and crops. Improving the spatial and temporal accuracy of hail forecasts would allow people to mitigate hail damage. We have developed an approach to forecasting hail that identifies potential hail storms in storm-scale numerical weather prediction models and matches them with observed hailstorms. Machine learning models, including random forests, gradient boosting trees, and linear regression, are used to predict the expected hail size from each forecast storm. The individual hail size forecasts are merged with a spatial neighborhood ensemble probability technique to produce a consensus probability of hail at least 25.4 mm in diameter. The system was evaluated during the 2014 National Oceanic and Atmospheric Administration Hazardous Weather Testbed Experimental Forecast Program and compared with a physics-based hail size model. The machine-learning-based technique shows advantages in producing smaller size errors and more reliable probability forecasts. The machine learning approaches correctly predicted the location and extent of a significant hail event in eastern Nebraska and a marginal severe hail event in Colorado.
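A hedged sketch of the spatial neighborhood ensemble probability step is given below; the neighborhood radius, hail threshold, and synthetic ensemble are illustrative assumptions rather than the operational configuration.

```python
# Illustrative sketch: for each ensemble member, spread predicted-hail
# exceedances over a square neighborhood, then average across members to
# obtain a consensus probability of hail at least 25.4 mm in diameter.
import numpy as np
from scipy.ndimage import maximum_filter

def neighborhood_ensemble_probability(hail_size, threshold_mm=25.4, radius=3):
    """hail_size: array (n_members, ny, nx) of predicted hail sizes in mm."""
    exceed = (hail_size >= threshold_mm).astype(float)
    size = 2 * radius + 1  # square neighborhood window in grid points
    neighborhood = np.stack([maximum_filter(m, size=size) for m in exceed])
    return neighborhood.mean(axis=0)  # fraction of members with nearby hail

# Fake 10-member ensemble of hail-size grids standing in for model output.
members = np.random.default_rng(0).gamma(2.0, 8.0, size=(10, 50, 50))
prob = neighborhood_ensemble_probability(members, threshold_mm=25.4, radius=3)
```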