Sigsoftmax: Reanalysis of the Softmax Bottleneck

Neural Information Processing Systems

Softmax is an output activation function for modeling categorical probability distributions in many applications of deep learning. However, a recent study revealed that softmax can be a bottleneck of representational capacity of neural networks in language modeling (the softmax bottleneck). In this paper, we propose an output activation function for breaking the softmax bottleneck without additional parameters. We re-analyze the softmax bottleneck from the perspective of the output set of log-softmax and identify the cause of the softmax bottleneck. On the basis of this analysis, we propose sigsoftmax, which is composed of a multiplication of an exponential function and sigmoid function. Sigsoftmax can break the softmax bottleneck. The experiments on language modeling demonstrate that sigsoftmax and mixture of sigsoftmax outperform softmax and mixture of softmax, respectively.
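As a minimal sketch of the activation described above, assuming sigsoftmax normalizes the product exp(z_i) * sigmoid(z_i) over all classes: the function names and the pure-Python list interface are illustrative choices, not the authors' implementation.

```python
import math


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def softmax(logits):
    # Standard softmax, shifted by the max logit for stability.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigsoftmax(logits):
    # Unnormalized weight per class: exp(z) * sigmoid(z).
    # The shift by the max logit is applied to the exp factor only,
    # so it cancels in the normalization and leaves the result unchanged.
    m = max(logits)
    weights = [math.exp(z - m) * sigmoid(z) for z in logits]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    logits = [1.0, 2.0, 3.0]
    print("softmax   :", softmax(logits))
    print("sigsoftmax:", sigsoftmax(logits))
```

Like softmax, the sketch adds no parameters and returns a valid probability vector. It does not guard against the case where every weight underflows to zero (extremely negative logits).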


Appa: Bending Weather Dynamics with Latent Diffusion Models for Global Data Assimilation

Andry, Gérôme, Lewin, Sacha, Rozet, François, Rochman, Omer, Mangeleer, Victor, Pirlet, Matthias, Faulx, Elise, Grégoire, Marilaure, Louppe, Gilles

arXiv.org Artificial Intelligence

Deep learning has advanced weather forecasting, but accurate predictions first require identifying the current state of the atmosphere from observational data. In this work, we introduce Appa, a score-based data assimilation model generating global atmospheric trajectories at 0.25° resolution and 1-hour intervals. Powered by a 565M-parameter latent diffusion model trained on ERA5, Appa can be conditioned on arbitrary observations to infer plausible trajectories, without retraining. Our probabilistic framework handles reanalysis, filtering, and forecasting within a single model, producing physically consistent reconstructions from various inputs. Results establish latent score-based data assimilation as a promising foundation for future global atmospheric modeling systems.


Beyond Resolution: Multi-Scale Weather and Climate Data for Alpine Renewable Energy in the Digital Twin Era -- First Evaluations and Recommendations

Schicker, Irene, Bügelmayer-Blaschek, Marianne, Lexer, Annemarie, Baier, Katharina, Hasel, Kristofer, Gazzaneo, Paolo

arXiv.org Artificial Intelligence

When Austrian hydropower production plummeted by 44% in early 2025 due to reduced snowpack, it exposed a critical vulnerability: standard meteorological and climatological datasets systematically fail in mountain regions that hold untapped renewable potential. This perspectives paper evaluates emerging solutions to the Alpine energy-climate data gap, analyzing datasets from global reanalyses (ERA5, 31 km) to kilometre-scale Digital Twins (Climate DT, Extremes DT, 4.4 km), regional reanalyses (ARA, 2.5 km), and next-generation AI weather prediction models (AIFS, 31 km). The multi-resolution assessment reveals that no single dataset excels universally: coarse reanalyses provide essential climatologies but miss valley-scale processes, while Digital Twins resolve Alpine dynamics yet remain computationally demanding. Effective energy planning therefore requires strategic dataset combinations validated against energy-relevant indices such as population-weighted extremes, wind-gust return periods, and Alpine-adjusted storm thresholds. A key frontier is sub-hourly (10-15 min) temporal resolution to match grid-operation needs. Six evidence-based recommendations outline pathways for bridging spatial and temporal scales. As renewable deployment expands globally into complex terrain, the Alpine region offers transferable perspectives for tackling identical forecasting and climate analysis challenges in mountainous regions worldwide.


OceanAI: A Conversational Platform for Accurate, Transparent, Near-Real-Time Oceanographic Insights

Chen, Bowen, Gajbhar, Jayesh, Dusek, Gregory, Redmon, Rob, Hogan, Patrick, Liu, Paul, Bohnenstiehl, DelWayne, Xu, Dongkuan, He, Ruoying

arXiv.org Artificial Intelligence

Artificial intelligence is transforming the sciences, yet general conversational AI systems often generate unverified "hallucinations" undermining scientific rigor. We present OceanAI, a conversational platform that integrates the natural-language fluency of open-source large language models (LLMs) with real-time, parameterized access to authoritative oceanographic data streams hosted by the National Oceanic and Atmospheric Administration (NOAA). Each query, such as "What was Boston Harbor's highest water level in 2024?", triggers real-time API calls that identify, parse, and synthesize relevant datasets into reproducible natural-language responses and data visualizations. In a blind comparison with three widely used AI chat-interface products, only OceanAI produced NOAA-sourced values with original data references; others either declined to answer or provided unsupported results. Designed for extensibility, OceanAI connects to multiple NOAA data products and variables, supporting applications in marine hazard forecasting, ecosystem assessment, and water-quality monitoring. By grounding outputs in verifiable observations, OceanAI advances transparency, reproducibility, and trust, offering a scalable framework for AI-enabled decision support within the oceans. A public demonstration is available at https://oceanai.ai4ocean.xyz.


Learning Coupled Earth System Dynamics with GraphDOP

Boucher, Eulalie, Alexe, Mihai, Lean, Peter, Pinnington, Ewan, Lang, Simon, Laloyaux, Patrick, Zampieri, Lorenzo, de Rosnay, Patricia, Bormann, Niels, McNally, Anthony

arXiv.org Artificial Intelligence

Interactions between different components of the Earth System (e.g. ocean, atmosphere, land and cryosphere) are a crucial driver of global weather patterns. Modern Numerical Weather Prediction (NWP) systems typically run separate models of the different components, explicitly coupled across their interfaces to additionally model exchanges between the different components. Accurately representing these coupled interactions remains a major scientific and technical challenge of weather forecasting. GraphDOP is a graph-based machine learning model that learns to forecast weather directly from raw satellite and in-situ observations, without reliance on reanalysis products or traditional physics-based NWP models. GraphDOP simultaneously embeds information from diverse observation sources spanning the full Earth system into a shared latent space. This enables predictions that implicitly capture cross-domain interactions in a single model without the need for any explicit coupling. Here we present a selection of case studies which illustrate the capability of GraphDOP to forecast events where coupled processes play a particularly key role. These include rapid sea-ice freezing in the Arctic, mixing-induced ocean surface cooling during Hurricane Ian and the severe European heat wave of 2022. The results suggest that learning directly from Earth System observations can successfully characterise and propagate cross-component interactions, offering a promising path towards physically consistent end-to-end data-driven Earth System prediction with a single model.


Climate Knowledge in Large Language Models

Kuznetsov, Ivan, Grassi, Jacopo, Pantiukhin, Dmitrii, Shapkin, Boris, Jung, Thomas, Koldunov, Nikolay

arXiv.org Artificial Intelligence

Large language models (LLMs) are increasingly deployed for climate-related applications, where understanding internal climatological knowledge is crucial for reliability and misinformation risk assessment. Despite growing adoption, the capacity of LLMs to recall climate normals from parametric knowledge remains largely uncharacterized. We investigate the capacity of contemporary LLMs to recall climate normals without external retrieval, focusing on a prototypical query: mean July 2-m air temperature 1991-2020 at specified locations. We construct a global grid of queries at 1° resolution land points, providing coordinates and location descriptors, and validate responses against ERA5 reanalysis. Results show that LLMs encode non-trivial climate structure, capturing latitudinal and topographic patterns, with root-mean-square errors of 3-6 °C and biases of ±1 °C. However, spatially coherent errors remain, particularly in mountains and high latitudes. Performance degrades sharply above 1500 m, where RMSE reaches 5-13 °C compared to 2-4 °C at lower elevations. We find that including geographic context (country, city, region) reduces errors by 27% on average, with larger models being most sensitive to location descriptors. While models capture the global mean magnitude of observed warming between 1950-1974 and 2000-2024, they fail to reproduce spatial patterns of temperature change, which directly relate to assessing climate change. This limitation highlights that while LLMs may capture present-day climate distributions, they struggle to represent the regional and local expression of long-term shifts in temperature essential for understanding climate dynamics. Our evaluation framework provides a reproducible benchmark for quantifying parametric climate knowledge in LLMs and complements existing climate communication assessments.
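The two error metrics quoted above (RMSE and bias) can be illustrated with a short sketch. It assumes an unweighted mean over grid points; the abstract does not say whether the authors apply area weighting, and the function name and sample values are hypothetical.

```python
import math


def rmse_and_bias(predicted, reference):
    """Return (RMSE, mean bias) of predicted values against a reference.

    Assumption: every grid point counts equally (no latitude/area weighting).
    """
    if not predicted or len(predicted) != len(reference):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [p - r for p, r in zip(predicted, reference)]
    bias = sum(errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return rmse, bias


if __name__ == "__main__":
    # Hypothetical July 2-m temperatures in °C at four grid points.
    llm_answers = [21.0, 14.5, 30.0, 5.0]
    reanalysis = [19.5, 15.0, 28.0, 9.0]
    rmse, bias = rmse_and_bias(llm_answers, reanalysis)
    print(f"RMSE = {rmse:.2f} °C, bias = {bias:+.2f} °C")
```

A bias near zero alongside a large RMSE would mean the errors cancel on average while individual locations are still far off, which matches the spatially coherent errors the abstract describes.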


Incorporating Multivariate Consistency in ML-Based Weather Forecasting with Latent-space Constraints

Fan, Hang, Xiao, Yi, Qu, Yongquan, Ling, Fenghua, Fei, Ben, Bai, Lei, Gentine, Pierre

arXiv.org Artificial Intelligence

Data-driven machine learning (ML) models have recently shown promise in surpassing traditional physics-based approaches for weather forecasting, leading to a so-called second revolution in weather forecasting. However, most ML-based forecast models treat reanalysis as the truth and are trained under variable-specific loss weighting, ignoring their physical coupling and spatial structure. Over long time horizons, the forecasts become blurry and physically unrealistic under rollout training. To address this, we reinterpret model training as a weak-constraint four-dimensional variational data assimilation (WC-4DVar) problem, treating reanalysis data as imperfect observations. This allows the loss function to incorporate reanalysis error covariance and capture multivariate dependencies. In practice, we compute the loss in a latent space learned by an autoencoder (AE), where the reanalysis error covariance becomes approximately diagonal, thus avoiding the need to explicitly model it in the high-dimensional model space. We show that rollout training with latent-space constraints improves long-term forecast skill and better preserves fine-scale structures and physical realism compared to training with model-space loss. Finally, we extend this framework to accommodate heterogeneous data sources, enabling the forecast model to be trained jointly on reanalysis and multi-source observations within a unified theoretical formulation.


A comparison of stretched-grid and limited-area modelling for data-driven regional weather forecasting

Wijnands, Jasper S., Van Ginderachter, Michiel, François, Bastien, Buurman, Sophie, Termonia, Piet, Bleeken, Dieter Van den

arXiv.org Artificial Intelligence

Regional machine learning weather prediction (MLWP) models based on graph neural networks have recently demonstrated remarkable predictive accuracy, outperforming numerical weather prediction models at lower computational costs. In particular, limited-area model (LAM) and stretched-grid model (SGM) approaches have emerged for generating high-resolution regional forecasts, based on initial conditions from a regional (re)analysis. While LAM uses lateral boundaries from an external global model, SGM incorporates a global domain at lower resolution. This study aims to understand how the differences in model design impact relative performance and potential applications. Specifically, the strengths and weaknesses of these two approaches are identified for generating deterministic regional forecasts over Europe. Using the Anemoi framework, models of both types are built by minimally adapting a shared architecture and trained using global and regional reanalyses in a near-identical setup. Several inference experiments have been conducted to explore their relative performance and highlight key differences. Results show that both LAM and SGM are competitive deterministic MLWP models with generally accurate and comparable forecasting performance over the regional domain. Various differences were identified in the performance of the models across applications. LAM is able to successfully exploit high-quality boundary forcings to make predictions within the regional domain and is suitable in contexts where global data is difficult to acquire. SGM is fully self-contained for easier operationalisation, can take advantage of more training data and significantly surpasses LAM in terms of (temporal) generalisability. Our paper can serve as a starting point for meteorological institutes to guide their choice between LAM and SGM in developing an operational data-driven forecasting system.


UT-GraphCast Hindcast Dataset: A Global AI Forecast Archive from UT Austin for Weather and Climate Applications

Sudharsan, Naveen, Singh, Manmeet, Kamath, Harsh, Dashtian, Hassan, Dawson, Clint, Yang, Zong-Liang, Niyogi, Dev

arXiv.org Artificial Intelligence

The UT-GraphCast Hindcast Dataset (1979-2024) is a comprehensive global weather forecast archive generated using the Google DeepMind GraphCast Operational model. Developed by researchers at The University of Texas at Austin and published under the WCRP umbrella, this dataset provides daily 15-day deterministic forecasts at 00 UTC on a 0.25° × 0.25° global grid (~25 km) for a 45-year period. It predicts more than a dozen key atmospheric and surface variables on 37 vertical levels, delivering a full medium-range forecast in under one minute on modern hardware. This new hindcast archive enables retrospective studies of historical weather, climate variability, and extreme events with unprecedented spatial and temporal detail. Preliminary validation shows that GraphCast forecasts generally reproduce ERA5 conditions with high fidelity and skill comparable or superior to conventional numerical models up to 10-15 days. In particular, GraphCast is known to outperform the state-of-the-art ECMWF IFS High-Resolution model (HRES) [Lam et al., 2023] on most verification targets, and to predict severe events (e.g., tropical cyclones, atmospheric rivers, heatwaves) with excellent accuracy. These benchmarks suggest that the GraphCast hindcast will be a valuable tool for climate and weather research.


Deep Spatio-Temporal Neural Network for Air Quality Reanalysis

Kheder, Ammar, Foreback, Benjamin, Wang, Lili, Liu, Zhi-Song, Boy, Michael

arXiv.org Artificial Intelligence

Air quality prediction is key to mitigating health impacts and guiding decisions, yet existing models tend to focus on temporal trends while overlooking spatial generalization. We propose AQ-Net, a spatiotemporal reanalysis model for both observed and unobserved stations in the near future. AQ-Net uses an LSTM and multi-head attention for the temporal regression. We also propose a cyclic encoding technique to ensure continuous time representation. To learn fine-grained spatial air quality estimation, we combine AQ-Net with a neural kNN to explore feature-based interpolation, so that we can fill the spatial gaps between coarsely distributed observation stations. To demonstrate the efficiency of our model for spatiotemporal reanalysis, we use data from 2013-2017 collected in northern China for PM2.5 analysis. Extensive experiments show that AQ-Net excels in air quality reanalysis, highlighting the potential of hybrid spatio-temporal models to better capture environmental dynamics, especially in urban areas where both spatial and temporal variability are critical.
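The abstract does not specify the form of its cyclic encoding. One common way to obtain a continuous time representation, shown here as an assumption and not as AQ-Net's method, is to map a periodic quantity onto the unit circle with sine and cosine. The function names are hypothetical.

```python
import math


def cyclic_encode(value, period):
    """Map a periodic value (e.g. hour of day, period=24) to (sin, cos).

    Assumption: a plain sine/cosine encoding; the paper may differ.
    """
    angle = 2.0 * math.pi * (value % period) / period
    return (math.sin(angle), math.cos(angle))


def encoded_distance(a, b, period):
    # Euclidean distance between two encoded time values.
    sa, ca = cyclic_encode(a, period)
    sb, cb = cyclic_encode(b, period)
    return math.hypot(sa - sb, ca - cb)


if __name__ == "__main__":
    # 23:00 and 00:00 are one hour apart, just like 00:00 and 01:00.
    print(encoded_distance(23, 0, 24))
    print(encoded_distance(0, 1, 24))
```

With a raw hour feature, 23 and 0 look 23 units apart. In the encoded space they are as close as any other pair of adjacent hours, which removes the jump at the period boundary.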