Distillation and Interpretability of Ensemble Forecasts of ENSO Phase using Entropic Learning

Groom, Michael, Bassetti, Davide, Horenko, Illia, O'Kane, Terence J.

arXiv.org Machine Learning

This paper introduces a distillation framework for an ensemble of entropy-optimal Sparse Probabilistic Approximation (eSPA) models, trained exclusively on satellite-era observational and reanalysis data to predict ENSO phase up to 24 months in advance. While eSPA ensembles yield state-of-the-art forecast skill, they are harder to interpret than individual eSPA models. We show how to compress the ensemble into a compact set of "distilled" models by aggregating the structure of only those ensemble members that make correct predictions. This process yields a single, diagnostically tractable model for each forecast lead time that preserves forecast performance while also enabling diagnostics that are impractical to implement on the full ensemble. An analysis of the regime persistence of the distilled model "superclusters", as well as cross-lead clustering consistency, shows that the discretised system accurately captures the spatiotemporal dynamics of ENSO. By considering the effective dimension of the feature importance vectors, the complexity of the input space required for correct ENSO phase prediction is shown to peak when forecasts must cross the boreal spring predictability barrier. Spatial importance maps derived from the feature importance vectors are introduced to identify where predictive information resides in each field and are shown to include known physical precursors at certain lead times. Case studies of key events are also presented, showing how fields reconstructed from distilled model centroids trace the evolution from extratropical and inter-basin precursors to the mature ENSO state. Overall, the distillation framework enables a rigorous investigation of long-range ENSO predictability that complements real-time data-driven operational forecasts.
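The "effective dimension of the feature importance vectors" mentioned in the abstract can be illustrated with the participation ratio, a common way to summarise how many components of a non-negative weight vector carry meaningful mass. This is a minimal sketch under that assumption; the paper may define effective dimension differently.

```python
import numpy as np

def effective_dimension(w):
    """Participation ratio of a non-negative importance vector.

    Equals d for a uniform vector of length d and 1 for a one-hot
    vector, so it interpolates between 'all features matter equally'
    and 'a single feature dominates'.
    """
    w = np.asarray(w, dtype=float)
    return w.sum() ** 2 / (w ** 2).sum()

# Uniform importance over 4 features -> effective dimension 4
print(effective_dimension([0.25, 0.25, 0.25, 0.25]))  # 4.0
# Importance concentrated on one feature -> close to 1
print(effective_dimension([0.97, 0.01, 0.01, 0.01]))
```

Under this measure, a peak in effective dimension at the boreal spring predictability barrier would mean that correct forecasts crossing the barrier draw on a broader set of input features than forecasts at other lead times.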






Scaling transformer neural networks for skillful and reliable medium-range weather forecasting

Nguyen, Tung

Neural Information Processing Systems

Recently, data-driven approaches for weather forecasting based on deep learning have shown great promise, achieving accuracies that are competitive with operational systems. However, those methods often employ complex, customized architectures without sufficient ablation analysis, making it difficult to understand what truly contributes to their success.


Appendix B: Diffusion process as ODE

Neural Information Processing Systems

In this section, we show that Cold Sampling is an approximation of the Euler method for (5). B.2 Why is cold sampling better than naive sampling? Naive sampling does not have this property; the proof relies on applying the definition of a Lipschitz function twice.
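The contrast between the two samplers can be sketched on a toy problem. The excerpt does not show equation (5) or the actual operators, so everything below is an illustrative assumption: `D` is a purely additive degradation, `R` is its inverse corrupted by a constant bias `eps`, and the update rules follow the standard cold-diffusion recipe, where cold sampling re-degrades the current estimate at two adjacent steps so that restoration errors cancel.

```python
import numpy as np

T = 10
c = np.linspace(0.0, 5.0, T + 1)   # degradation schedule; c[0] = 0 (clean)
eps = 0.01                          # systematic restoration error (assumed)

def D(x0, s):
    """Toy additive degradation: D(x0, s) = x0 + c_s."""
    return x0 + c[s]

def R(xs, s):
    """Imperfect restoration: the true inverse plus a small bias eps."""
    return xs - c[s] + eps

def naive_sample(xT):
    # Naive sampling: restore, then re-degrade to the previous level.
    x = xT
    for s in range(T, 0, -1):
        x = D(R(x, s), s - 1)
    return x

def cold_sample(xT):
    # Cold sampling: Euler-like correction using D at steps s and s-1,
    # so any constant error in R cancels between the two D terms.
    x = xT
    for s in range(T, 0, -1):
        x0_hat = R(x, s)
        x = x - D(x0_hat, s) + D(x0_hat, s - 1)
    return x

x0 = 1.0
xT = D(x0, T)
print(abs(naive_sample(xT) - x0))  # bias accumulates over T steps, ~0.1
print(abs(cold_sample(xT) - x0))   # bias cancels step by step, ~0.0
```

In this additive toy setting the naive sampler injects the restoration bias once per step, while cold sampling removes and re-adds `D(x0_hat, ...)` so the bias cancels exactly; this mirrors, in a simplified form, the stability property that the excerpt proves via the Lipschitz argument.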