Southern Ocean
Robustness of AI-based weather forecasts in a changing climate
Rackow, Thomas, Koldunov, Nikolay, Lessig, Christian, Sandu, Irina, Alexe, Mihai, Chantry, Matthew, Clare, Mariana, Dramsch, Jesper, Pappenberger, Florian, Pedruzo-Bagazgoitia, Xabier, Tietsche, Steffen, Jung, Thomas
Data-driven machine learning models for weather forecasting have made transformational progress in the last 1-2 years, with state-of-the-art ones now outperforming the best physics-based models for a wide range of skill scores. Given the strong links between weather and climate modelling, this raises the question of whether machine learning models could also revolutionize climate science, for example by informing mitigation and adaptation to climate change or by generating larger ensembles for more robust uncertainty estimates. Here, we show that current state-of-the-art machine learning models trained for weather forecasting in present-day climate produce skillful forecasts across different climate states corresponding to pre-industrial, present-day, and future 2.9 K warmer climates. This indicates that the dynamics shaping the weather on short timescales may not differ fundamentally in a changing climate. It also demonstrates out-of-distribution generalization capabilities of the machine learning models that are a critical prerequisite for climate applications. Nonetheless, two of the models show a global-mean cold bias in the forecasts for the future warmer climate state, i.e. they drift towards the colder present-day climate they were trained on. A similar result is obtained for the pre-industrial case, where two out of three models show a warming. We discuss possible remedies for these biases and analyze their spatial distribution, revealing complex warming and cooling patterns that are partly related to missing ocean-sea ice and land surface information in the training data. Despite these current limitations, our results suggest that data-driven machine learning models will provide powerful tools for climate science and transform established approaches by complementing conventional physics-based models.
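The "global-mean cold bias" mentioned in this abstract can be made concrete with a small sketch. It assumes a regular latitude-longitude grid and cosine-of-latitude area weights; the abstract does not state how the authors computed their global means, and the function names `global_mean` and `global_mean_bias` are hypothetical.

```python
import math
from typing import List


def global_mean(field: List[List[float]], lats_deg: List[float]) -> float:
    """Area-weighted mean of field[i][j] (latitude row i, longitude column j).

    Assumption: regular lat-lon grid, so each row is weighted by cos(latitude).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for row, lat in zip(field, lats_deg):
        w = math.cos(math.radians(lat))
        weighted_sum += w * sum(row)
        weight_total += w * len(row)
    return weighted_sum / weight_total


def global_mean_bias(forecast: List[List[float]],
                     reference: List[List[float]],
                     lats_deg: List[float]) -> float:
    """Forecast minus reference global mean; negative values mean a cold bias."""
    return global_mean(forecast, lats_deg) - global_mean(reference, lats_deg)


# Toy 3x2 temperature fields in kelvin (illustrative values, not data from the paper).
lats = [-60.0, 0.0, 60.0]
reference = [[280.0, 280.0], [300.0, 300.0], [280.0, 280.0]]
forecast = [[value - 1.0 for value in row] for row in reference]
bias = global_mean_bias(forecast, reference, lats)
```

In this toy case the forecast is uniformly 1 K colder than the reference, so `bias` is negative, which is the sign convention for a cold bias.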
Meet Pesto, the 49-pound baby penguin going viral online
Sea Life Melbourne Aquarium celebrates their star penguin, Pesto, who weighs a whopping 49 pounds. Pesto weighs more than both his proud parents combined at a staggering 49 pounds. His parents, Hudson and Tango, each weigh about 24 pounds. According to a statement from the Sea Life Melbourne Aquarium, Pesto is the heaviest chick the facility has ever had. His gender was announced to the world earlier this month when his keeper, Michaela Smale, "shovel[ed] away a mountain of fresh snow to unleash an avalanche of blue."
Antarctica's 'Doomsday Glacier' is on the verge of COLLAPSING: Huge ice sheet the size of Great Britain could cause global sea levels to rise by 2 FEET, study warns
With the potential to cause sea levels across the planet to rise, it's no wonder the Thwaites Glacier has earned the nickname the 'Doomsday Glacier.'
Now, scientists have revealed concerning findings about how and when the glacier could collapse. Researchers from the British Antarctic Survey (BAS) used underwater robots to take new measurements of the glacier, which is the same size as Great Britain. The data indicates that the Thwaites Glacier and much of the West Antarctic Ice Sheet could be lost entirely by the 23rd century. Worryingly, if it collapses entirely, the experts say global sea levels would rise by two feet (65cm) - plunging huge areas underwater. The Thwaites Glacier is roughly 74.5 miles (120km) across - the same size as Great Britain or Florida - making it the widest glacier on the planet.
Prithvi WxC: Foundation Model for Weather and Climate
Schmude, Johannes, Roy, Sujit, Trojak, Will, Jakubik, Johannes, Civitarese, Daniel Salles, Singh, Shraddha, Kuehnert, Julian, Ankur, Kumar, Gupta, Aman, Phillips, Christopher E, Kienzler, Romeo, Szwarcman, Daniela, Gaur, Vishal, Shinde, Rajat, Lal, Rohit, Da Silva, Arlindo, Diaz, Jorge Luis Guevara, Jones, Anne, Pfreundschuh, Simon, Lin, Amy, Sheshadri, Aditi, Nair, Udaysankar, Anantharaj, Valentine, Hamann, Hendrik, Watson, Campbell, Maskey, Manil, Lee, Tsengdar J, Moreno, Juan Bernabe, Ramachandran, Rahul
Triggered by the realization that AI emulators can rival the performance of traditional numerical weather prediction models running on HPC systems, there is now an increasing number of large AI models that address use cases such as forecasting, downscaling, or nowcasting. While the parallel developments in the AI literature focus on foundation models -- models that can be effectively tuned to address multiple, different use cases -- the developments on the weather and climate side largely focus on single-use cases with particular emphasis on mid-range forecasting. We close this gap by introducing Prithvi WxC, a 2.3 billion parameter foundation model developed using 160 variables from the Modern-Era Retrospective Analysis for Research and Applications, Version 2 (MERRA-2). Prithvi WxC employs an encoder-decoder-based architecture, incorporating concepts from various recent transformer models to effectively capture both regional and global dependencies in the input data. The model has been designed to accommodate large token counts to model weather phenomena in different topologies at fine resolutions. Furthermore, it is trained with a mixed objective that combines the paradigms of masked reconstruction with forecasting. We test the model on a set of challenging downstream tasks namely: Autoregressive rollout forecasting, Downscaling, Gravity wave flux parameterization, and Extreme events estimation. The pretrained model with 2.3 billion parameters, along with the associated fine-tuning workflows, has been publicly released as an open-source contribution via Hugging Face.
Generative Diffusion Model-based Downscaling of Observed Sea Surface Height over Kuroshio Extension since 2000
Han, Qiuchang, Jiang, Xingliang, Zhao, Yang, Wang, Xudong, Li, Zhijin, Zhang, Renhe
Satellite altimetry has been widely utilized to monitor global sea surface dynamics, enabling investigation of upper ocean variability from basin-scale to localized eddy ranges. However, the sparse spatial resolution of observational altimetry limits our understanding of oceanic submesoscale variability, prevalent at horizontal scales below 0.25° resolution. Here, we train a state-of-the-art generative diffusion model on high-resolution sea surface height (SSH) reanalysis data and demonstrate its advantage in observational SSH downscaling over the eddy-rich Kuroshio Extension region. The diffusion-based model effectively downscales raw satellite-interpolated data from 0.25° resolution to 1/16°, corresponding to approximately 12-km wavelength. This model outperforms other high-resolution reanalysis datasets and neural network-based methods. Also, it successfully reproduces the spatial patterns and power spectra of satellite along-track observations. Our diffusion-based results indicate that eddy kinetic energy at horizontal scales less than 250 km has intensified significantly since 2004 in the Kuroshio Extension region. These findings underscore the great potential of deep learning in reconstructing satellite altimetry and enhancing our understanding of ocean dynamics at eddy scales.
A Comparative Analysis of Faithfulness Metrics and Humans in Citation Evaluation
Zhang, Weijia, Aliannejadi, Mohammad, Pei, Jiahuan, Yuan, Yifei, Huang, Jia-Hong, Kanoulas, Evangelos
Large language models (LLMs) often generate unsupported or unverifiable content, known as "hallucinations." To address this, retrieval-augmented LLMs are employed to include citations in their content, grounding the content in verifiable sources. Despite such developments, manually assessing how well a citation supports the associated statement remains a major challenge. Previous studies tackle this challenge by leveraging faithfulness metrics to estimate citation support automatically. However, they limit this citation support estimation to a binary classification scenario, neglecting fine-grained citation support in practical scenarios. To investigate the effectiveness of faithfulness metrics in fine-grained scenarios, we propose a comparative evaluation framework that assesses how effectively each metric distinguishes citations across three support levels: full, partial, and no support. Our framework employs correlation analysis, classification evaluation, and retrieval evaluation to measure the alignment between metric scores and human judgments comprehensively. Our results indicate no single metric consistently excels across all evaluations, highlighting the complexity of accurately evaluating fine-grained support levels. Particularly, we find that the best-performing metrics struggle to distinguish partial support from full or no support. Based on these findings, we provide practical recommendations for developing more effective metrics.
The impact of internal variability on benchmarking deep learning climate emulators
Lütjens, Björn, Ferrari, Raffaele, Watson-Parris, Duncan, Selin, Noelle
Full-complexity Earth system models (ESMs) are computationally very expensive, limiting their use in exploring the climate outcomes of multiple emission pathways. More efficient emulators that approximate ESMs can directly map emissions onto climate outcomes, and benchmarks are being used to evaluate their accuracy on standardized tasks and datasets. We investigate a popular benchmark in data-driven climate emulation, ClimateBench, on which deep learning-based emulators are currently achieving the best performance. We implement a linear regression-based emulator, akin to pattern scaling, and find that it outperforms the incumbent 100M-parameter deep learning foundation model, ClimaX, on 3 out of 4 regionally-resolved surface-level climate variables. While emulating surface temperature is expected to be predominantly linear, this result is surprising for emulating precipitation. We identify that this outcome is a result of high levels of internal variability in the benchmark targets. To address internal variability, we update the benchmark targets with ensemble averages from the MPI-ESM1.2-LR model that contain 50 instead of 3 climate simulations per emission pathway. Using the new targets, we show that linear pattern scaling continues to be more accurate on temperature, but can be outperformed by a deep learning-based model for emulating precipitation. We publish our code, data, and an interactive tutorial at github.com/blutjens/climate-emulator.
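A linear pattern-scaling emulator of the kind described in this abstract can be sketched as one least-squares line per grid cell. This is a minimal sketch under assumptions, not the authors' implementation (their code is at the URL given in the abstract): the predictor is taken to be a global-mean temperature anomaly, and the names `fit_pattern_scaling` and `predict_pattern` are hypothetical.

```python
from typing import List, Tuple


def fit_pattern_scaling(global_t: List[float],
                        local: List[List[float]]) -> List[Tuple[float, float]]:
    """Fit local[t][c] ~ slope[c] * global_t[t] + intercept[c] for every cell c.

    global_t: one global-mean value per time step (assumed predictor).
    local:    local[t][c] is the target variable at time t in grid cell c.
    Returns one (slope, intercept) pair per grid cell.
    """
    n = len(global_t)
    mean_g = sum(global_t) / n
    var_g = sum((g - mean_g) ** 2 for g in global_t)
    if var_g == 0.0:
        raise ValueError("global_t must vary to fit a slope")
    coeffs = []
    for c in range(len(local[0])):
        column = [row[c] for row in local]
        mean_y = sum(column) / n
        cov = sum((g - mean_g) * (y - mean_y) for g, y in zip(global_t, column))
        slope = cov / var_g
        coeffs.append((slope, mean_y - slope * mean_g))
    return coeffs


def predict_pattern(coeffs: List[Tuple[float, float]], g: float) -> List[float]:
    """Evaluate the fitted per-cell lines for a new global-mean value g."""
    return [slope * g + intercept for slope, intercept in coeffs]


# Toy data: 3 time steps, 2 grid cells (illustrative values only).
coeffs = fit_pattern_scaling([0.0, 1.0, 2.0],
                             [[0.0, 1.0], [2.0, 1.5], [4.0, 2.0]])
pattern = predict_pattern(coeffs, 3.0)
```

Because each cell gets an independent two-parameter fit, the emulator cannot overfit noise in individual simulations the way a large network can, which is consistent with the abstract's point that internal variability in the targets affects the comparison.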
From Loops to Oops: Fallback Behaviors of Language Models Under Uncertainty
Ivgi, Maor, Yoran, Ori, Berant, Jonathan, Geva, Mor
Large language models (LLMs) often exhibit undesirable behaviors, such as hallucinations and sequence repetitions. We propose to view these behaviors as fallbacks that models exhibit under uncertainty, and investigate the connection between them. We categorize fallback behaviors -- sequence repetitions, degenerate text, and hallucinations -- and extensively analyze them in models from the same family that differ by the amount of pretraining tokens, parameter count, or the inclusion of instruction-following training. Our experiments reveal a clear and consistent ordering of fallback behaviors, across all these axes: the more advanced an LLM is (i.e., trained on more tokens, has more parameters, or instruction-tuned), its fallback behavior shifts from sequence repetitions, to degenerate text, and then to hallucinations. Moreover, the same ordering is observed throughout a single generation, even for the best-performing models; as uncertainty increases, models shift from generating hallucinations to producing degenerate text and then sequence repetitions. Lastly, we demonstrate that while common decoding techniques, such as random sampling, might alleviate some unwanted behaviors like sequence repetitions, they increase harder-to-detect hallucinations.
Enforcing Equity in Neural Climate Emulators
Neural network emulators have become an invaluable tool for a wide variety of climate and weather prediction tasks. While showing incredibly promising results, these networks do not have an inherent ability to produce equitable predictions. That is, they are not guaranteed to provide a uniform quality of prediction along any particular class or group of people. This potential for inequitable predictions motivates the need for explicit representations of fairness in these neural networks. To that end, we draw on methods for enforcing analytical physical constraints in neural networks to bias networks towards more equitable predictions. We demonstrate the promise of this methodology using the task of climate model emulation. Specifically, we propose a custom loss function which punishes emulators with unequal quality of predictions across any prespecified regions or category, here defined using human development index (HDI). This loss function weighs a standard loss metric such as mean squared error against another metric which captures inequity along the equity category (HDI), allowing us to adjust the priority of each term before training. Importantly, the loss function does not specify a particular definition of equity to bias the neural network towards, opening the door for custom fairness metrics. Our results show that neural climate emulators trained with our loss function provide more equitable predictions and that the equity metric improves with greater weighting in the loss function. We empirically demonstrate that while there is a tradeoff between accuracy and equity when prioritizing the latter during training, an appropriate selection of the equity priority hyperparameter can minimize loss of performance.
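The loss described in this abstract weighs a standard error term against an inequity term. The sketch below shows one way such a loss could look. The inequity metric used here (variance of per-group mean squared error) and the convex weighting by `lam` are assumptions for illustration; the abstract deliberately leaves the equity definition open, and the name `equity_weighted_loss` is hypothetical.

```python
from typing import Dict, Hashable, List, Sequence


def equity_weighted_loss(pred: Sequence[float],
                         target: Sequence[float],
                         groups: Sequence[Hashable],
                         lam: float) -> float:
    """Return (1 - lam) * MSE + lam * inequity.

    groups[i] labels the category of sample i (for example an HDI bracket).
    Assumed inequity metric: variance of the per-group MSE values.
    lam in [0, 1] sets the priority of equity relative to accuracy.
    """
    squared_errors = [(p - t) ** 2 for p, t in zip(pred, target)]
    overall_mse = sum(squared_errors) / len(squared_errors)

    by_group: Dict[Hashable, List[float]] = {}
    for err, g in zip(squared_errors, groups):
        by_group.setdefault(g, []).append(err)
    group_mse = [sum(errs) / len(errs) for errs in by_group.values()]

    mean_group_mse = sum(group_mse) / len(group_mse)
    inequity = sum((m - mean_group_mse) ** 2 for m in group_mse) / len(group_mse)
    return (1.0 - lam) * overall_mse + lam * inequity


# Toy example: predictions are perfect for one group and off by 2 for the other.
pred = [1.0, 2.0, 3.0, 4.0]
target = [1.0, 2.0, 1.0, 2.0]
groups = ["low_hdi", "low_hdi", "high_hdi", "high_hdi"]
accuracy_only = equity_weighted_loss(pred, target, groups, lam=0.0)
balanced = equity_weighted_loss(pred, target, groups, lam=0.5)
```

Setting `lam=0.0` recovers plain mean squared error, while larger values penalise emulators whose error is concentrated in particular groups.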
Graph Neural Networks for Emulation of Finite-Element Ice Dynamics in Greenland and Antarctic Ice Sheets
Koo, Younghyun, Rahnemoonfar, Maryam
Although numerical models provide accurate solutions for ice sheet dynamics based on physical laws, they incur heavy computational demands to solve partial differential equations. In recent years, convolutional neural networks (CNNs) have been widely used as statistical emulators for those numerical models. However, since CNNs operate on regular grids, they cannot represent the refined meshes and computational efficiency of finite-element numerical models. Therefore, instead of CNNs, this study adopts an equivariant graph convolutional network (EGCN) as an emulator for ice sheet dynamics modeling. EGCN reproduces ice thickness and velocity changes in the Helheim Glacier, Greenland, and Pine Island Glacier, Antarctica, with 260 times and 44 times faster computation time, respectively. Compared to the traditional CNN and graph convolutional network, EGCN shows outstanding accuracy in thickness prediction near fast ice streams by preserving the equivariance to the translation and rotation of graphs.
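The translation and rotation symmetry mentioned in this abstract can be illustrated by checking that the edge lengths of a small mesh graph are unchanged when the whole graph is rotated and shifted. This is only a sketch of the underlying geometric property, not the authors' EGCN; the toy node coordinates and the helper names `rotate_translate` and `edge_lengths` are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def rotate_translate(points: List[Point], angle_rad: float, shift: Point) -> List[Point]:
    """Rotate every 2-D node about the origin by angle_rad, then add shift."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1]) for x, y in points]


def edge_lengths(points: List[Point], edges: List[Tuple[int, int]]) -> List[float]:
    """Euclidean length of each edge (i, j) between node indices."""
    return [math.dist(points[i], points[j]) for i, j in edges]


# Toy triangular mesh element with side lengths 3, 4 and 5.
nodes = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0)]
edges = [(0, 1), (1, 2), (0, 2)]

moved = rotate_translate(nodes, angle_rad=0.7, shift=(10.0, -2.0))
before = edge_lengths(nodes, edges)
after = edge_lengths(moved, edges)
unchanged = all(abs(a - b) < 1e-9 for a, b in zip(before, after))
```

A network whose messages depend only on such relative quantities gives the same scalar outputs regardless of how the mesh is oriented or positioned, which is the kind of property the abstract credits for the accuracy near fast ice streams.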