
Ensemble prediction


Trajectory learning for ensemble forecasts via the continuous ranked probability score: a Lorenz '96 case study

Ephrati, Sagy, Woodfield, James

arXiv.org Artificial Intelligence

This paper demonstrates the feasibility of trajectory learning for ensemble forecasts by employing the continuous ranked probability score (CRPS) as a loss function. Using the two-scale Lorenz '96 system as a case study, we develop and train both additive and multiplicative stochastic parametrizations to generate ensemble predictions. Results indicate that CRPS-based trajectory learning produces parametrizations that are both accurate and sharp. The resulting parametrizations are straightforward to calibrate and outperform derivative-fitting-based parametrizations in short-term forecasts. This approach is particularly promising for data assimilation applications due to its accuracy over short lead times.
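The loss at the heart of this approach has a simple empirical form: for an ensemble against a scalar observation, CRPS = E|X − y| − ½E|X − X′|, rewarding ensembles that are both close to the truth (first term) and not over-dispersed (second term). A minimal sketch of this standard estimator (illustrative, not code from the paper):

```python
import numpy as np

def crps_ensemble(members, obs):
    """Empirical CRPS of an ensemble vs. a scalar observation:
    E|X - y| - 0.5 * E|X - X'|."""
    m = np.asarray(members, dtype=float)
    term_obs = np.mean(np.abs(m - obs))                            # accuracy term
    term_spread = 0.5 * np.mean(np.abs(m[:, None] - m[None, :]))   # sharpness term
    return term_obs - term_spread
```

Minimizing this score along forecast trajectories is what drives the trained parametrizations to be simultaneously accurate and sharp.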


Scalable Semi-Supervised Aggregation of Classifiers

Akshay Balsubramani, Yoav Freund

Neural Information Processing Systems

We present and empirically evaluate an efficient algorithm that learns to aggregate the predictions of an ensemble of binary classifiers. The algorithm uses the structure of the ensemble predictions on unlabeled data to yield significant performance improvements. It does this without making assumptions on the structure or origin of the ensemble, without parameters, and as scalably as linear learning. We empirically demonstrate these performance gains with random forests.
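For orientation only: the paper's algorithm is a constrained aggregation computed from the ensemble's predictions on unlabeled data, but the basic object being improved is a weighted vote over binary (±1) classifiers, which can be sketched as:

```python
import numpy as np

def weighted_vote(preds, weights):
    """Aggregate binary (+1/-1) classifier predictions by a weighted majority:
    the sign of the weighted sum of votes."""
    preds = np.asarray(preds, dtype=float)   # shape (n_classifiers, n_examples)
    scores = weights @ preds                 # weighted vote per example
    return np.where(scores >= 0, 1, -1)
```

The contribution of the paper is choosing such an aggregation without labels, using only the structure of the ensemble's unlabeled predictions; the sketch above shows only the aggregation step itself.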



Flow Matching for Probabilistic Learning of Dynamical Systems from Missing or Noisy Data

Rout, Siddharth, Haber, Eldad, Gaudreault, Stephane

arXiv.org Artificial Intelligence

Learning dynamical systems is crucial across many fields, yet applying machine learning techniques remains challenging due to missing variables and noisy data. Classical mathematical models often struggle in these scenarios because the underlying physical problems become ill-posed. Stochastic machine learning techniques address this challenge by enabling the modeling of such ill-posed problems: a single known input to the trained model may yield multiple plausible outputs, all of which are correct, so probabilistic forecasting is inherently meaningful. In this study, we introduce a variant of flow matching for probabilistic forecasting that estimates possible future states as a distribution over outcomes rather than a single-point prediction. Perturbing complex dynamical states is also not trivial: the community typically applies Gaussian or uniform perturbations to key variables to model uncertainty, but not all variables behave in a Gaussian fashion. We therefore also propose a generative machine learning approach to perturb the states of complex high-dimensional dynamical systems in a physically and logically consistent way. Finally, we establish the mathematical foundations of our method and demonstrate its effectiveness on several challenging dynamical systems, including a variant of the high-dimensional WeatherBench dataset, which models the global weather at a 5.625° meridional resolution.
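The generic flow-matching recipe behind such methods (linear-interpolant version; the paper's variant may differ) regresses a velocity model onto the target velocity of an interpolation path between a data pair:

```python
import numpy as np

def fm_target(x0, x1, t):
    """Linear interpolant x_t = (1-t)*x0 + t*x1 and its target velocity x1 - x0."""
    xt = (1.0 - t) * x0 + t * x1
    vt = x1 - x0
    return xt, vt

def fm_loss(v_model, x0, x1, t):
    """Flow-matching regression loss || v(x_t, t) - (x1 - x0) ||^2."""
    xt, vt = fm_target(x0, x1, t)
    return float(np.mean((v_model(xt, t) - vt) ** 2))
```

At inference, integrating the learned velocity field from different noise samples yields different plausible futures, which is what turns the forecast into a distribution rather than a point estimate.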


A Probabilistic Approach to Wildfire Spread Prediction Using a Denoising Diffusion Surrogate Model

Yu, Wenbo, Ghosh, Anirbit, Finn, Tobias Sebastian, Arcucci, Rossella, Bocquet, Marc, Cheng, Sibo

arXiv.org Artificial Intelligence

We propose a stochastic framework for wildfire spread prediction using deep generative diffusion models with ensemble sampling. In contrast to traditional deterministic approaches that struggle to capture the inherent uncertainty and variability of wildfire dynamics, our method generates probabilistic forecasts by sampling multiple plausible future scenarios conditioned on the same initial state. As a proof-of-concept, the model is trained on synthetic wildfire data generated by a probabilistic cellular automata-based simulator, which integrates realistic environmental features such as canopy cover, vegetation density, and terrain slope, and is grounded in historical fire events including the Chimney and Ferguson fires. To assess predictive performance and uncertainty modelling, we compare two surrogate models with identical network architecture: one trained via conventional supervised regression, and the other using a conditional diffusion framework with ensemble sampling. In the diffusion-based emulator, multiple inference passes are performed for the same input state by resampling the initial latent variable, allowing the model to capture a distribution of possible outcomes.
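The ensemble mechanism described above can be sketched generically (the sampler here is a placeholder, not the paper's trained network): the conditioning state is held fixed while the initial latent is resampled for each member.

```python
import numpy as np

def diffusion_ensemble(sample_fn, state, n_members, latent_shape, seed=0):
    """Draw an ensemble by resampling the initial latent for a fixed conditioning state.
    sample_fn(z, state) stands in for a full conditional reverse-diffusion pass."""
    rng = np.random.default_rng(seed)
    members = []
    for _ in range(n_members):
        z = rng.standard_normal(latent_shape)   # fresh latent per member
        members.append(sample_fn(z, state))
    return np.stack(members)
```

The spread across members is then the model's uncertainty estimate for the fire's evolution from that initial state.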


Probabilistic Forecasting for Dynamical Systems with Missing or Imperfect Data

Rout, Siddharth, Haber, Eldad, Gaudreault, Stéphane

arXiv.org Artificial Intelligence

The modeling of dynamical systems is essential in many fields, but applying machine learning techniques is often challenging due to incomplete or noisy data. This study introduces a variant of stochastic interpolation (SI) for probabilistic forecasting, estimating future states as distributions rather than single-point predictions. We explore its mathematical foundations and demonstrate its effectiveness on various dynamical systems, including the challenging WeatherBench dataset.


Uncertainty-aware Long-tailed Weights Model the Utility of Pseudo-labels for Semi-supervised Learning

Wu, Jiaqi, Pang, Junbiao, Huang, Qingming

arXiv.org Artificial Intelligence

Current semi-supervised learning (SSL) adopts the pseudo-labeling strategy and filters pseudo-labels by confidence thresholds. This mechanism has notable drawbacks: 1) setting a reasonable threshold is an open problem that significantly influences the selection of high-quality pseudo-labels; and 2) deep models often exhibit over-confidence, which makes the confidence value an unreliable indicator of pseudo-label quality given the scarcity of labeled data. In this paper, we propose an Uncertainty-aware Ensemble Structure (UES) to assess the utility of pseudo-labels for unlabeled samples. We further model the utility of pseudo-labels as long-tailed weights, avoiding the open problem of setting a threshold. Concretely, the long-tailed weights ensure that even unreliable pseudo-labels still contribute to the model's robustness. Moreover, UES is lightweight and architecture-agnostic, easily extending to various computer vision tasks, including classification and regression. Experimental results demonstrate that combining the proposed method with DualPose leads to a 3.47% improvement in Percentage of Correct Keypoints (PCK) on the Sniffing dataset with 100 data points (30 labeled), a 7.29% improvement in PCK on the FLIC dataset with 100 data points (50 labeled), and a 3.91% improvement in PCK on the LSP dataset with 200 data points (100 labeled). Furthermore, when combined with FixMatch, the proposed method achieves a 0.2% accuracy improvement on CIFAR-10 with 40 labeled data points and a 0.26% accuracy improvement on CIFAR-100 with 400 labeled data points.
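The contrast with thresholding can be illustrated with a toy sketch (the weight function below is a hypothetical stand-in, not the paper's UES formulation): ensemble disagreement serves as the uncertainty signal, and weights decay smoothly instead of being cut off, so unreliable pseudo-labels keep a small nonzero contribution.

```python
import numpy as np

def pseudo_label_weights(ensemble_probs):
    """Toy utility weights: ensemble disagreement -> smoothly decaying weight.
    ensemble_probs: (n_models, n_samples, n_classes) class probabilities."""
    p = np.asarray(ensemble_probs, dtype=float)
    disagreement = p.var(axis=0).sum(axis=-1)   # per-sample predictive variance
    return 1.0 / (1.0 + disagreement)           # in (0, 1]; never exactly zero
```

Unlike a hard confidence threshold, no sample is discarded outright; disagreement only shrinks its influence.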


Test-Time Alignment via Hypothesis Reweighting

Lee, Yoonho, Williams, Jonathan, Marklund, Henrik, Sharma, Archit, Mitchell, Eric, Singh, Anikait, Finn, Chelsea

arXiv.org Artificial Intelligence

Large pretrained models often struggle with underspecified tasks -- situations where the training data does not fully define the desired behavior. For example, chatbots must handle diverse and often conflicting user preferences, requiring adaptability to various user needs. We propose a novel framework to address the general challenge of aligning models to test-time user intent, which is rarely fully specified during training. Our approach involves training an efficient ensemble, i.e., a single neural network with multiple prediction heads, each representing a different function consistent with the training data. Our main contribution is HyRe, a simple adaptation technique that dynamically reweights ensemble members at test time using a small set of labeled examples from the target distribution, which can be labeled in advance or actively queried from a larger unlabeled pool. By leveraging recent advances in scalable ensemble training, our method scales to large pretrained models, with computational costs comparable to fine-tuning a single model. We empirically validate HyRe in several underspecified scenarios, including personalization tasks and settings with distribution shifts. Additionally, with just five preference pairs from each target distribution, the same ensemble adapted via HyRe outperforms the accuracy of the prior state-of-the-art 2B-parameter reward model across 18 evaluation distributions.
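A minimal sketch of test-time reweighting, assuming a squared-error loss and softmax weights (the exact HyRe objective may differ): each head is scored on the few labeled target examples, and heads consistent with the target distribution receive more weight.

```python
import numpy as np

def reweight_heads(head_preds, y, temp=1.0):
    """Weight ensemble heads by fit to a small labeled adaptation set.
    head_preds: (n_heads, n_examples) predictions; y: (n_examples,) labels."""
    losses = np.mean((head_preds - y[None, :]) ** 2, axis=1)
    logits = -losses / temp
    w = np.exp(logits - logits.max())   # numerically stable softmax over heads
    return w / w.sum()
```

Because only the mixing weights change, adaptation needs no gradient steps through the backbone, which is what keeps the cost comparable to running a single model.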


Simmering: Sufficient is better than optimal for training neural networks

Babayan, Irina, Aliahmadi, Hazhir, van Anders, Greg

arXiv.org Artificial Intelligence

The broad range of neural network training techniques that invoke optimization but rely on ad hoc modification for validity [1-4] suggests that optimization-based training is misguided. Shortcomings of optimization-based training are brought into particularly strong relief by the problem of overfitting, where naive optimization produces spurious outcomes [5-7]. The broad success of neural networks for modelling physical processes [8-12] has prompted advances that invert the direction of investigation and treat neural networks as if they were physical systems in their own right [13-16]. These successes raise the question of whether broader physical perspectives could motivate the construction of improved training algorithms. Here, we introduce simmering, a physics-based method that trains neural networks to generate weights and biases that are merely "good enough" but which, paradoxically, outperforms leading optimization-based approaches. Using classification and regression examples, we show that simmering corrects neural networks that have been overfit by Adam [17], and that simmering avoids overfitting if deployed from the outset. Our results question optimization as a paradigm for neural network training and leverage information-geometric arguments to point to the existence of classes of sufficient training algorithms that do not take optimization as their starting point.
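The flavor of "sufficient rather than optimal" training can be conveyed by a Langevin-type update (an assumed analogue for illustration, not the authors' simmering dynamics): injected thermal noise keeps the weights hovering in low-loss regions instead of collapsing onto a single optimum.

```python
import numpy as np

def langevin_step(w, grad_fn, lr, temperature, rng):
    """One noisy update: a gradient step plus thermal noise, so the weights
    sample near low-loss regions rather than converging to one point."""
    noise = np.sqrt(2.0 * lr * temperature) * rng.standard_normal(w.shape)
    return w - lr * grad_fn(w) + noise
```

At nonzero temperature, the trajectory never settles; it explores a neighborhood of "good enough" solutions, which is the qualitative behavior the paper argues resists overfitting.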


Ensemble Prediction via Covariate-dependent Stacking

Wakayama, Tomoya, Sugasawa, Shonosuke

arXiv.org Machine Learning

This study proposes a novel approach to ensemble prediction, called covariate-dependent stacking (CDST). Unlike traditional stacking methods, CDST allows model weights to vary flexibly as a function of covariates, thereby enhancing predictive performance in complex scenarios. We formulate the covariate-dependent weights through combinations of basis functions, estimate them by optimizing a cross-validation criterion, and develop an expectation-maximization algorithm to ensure computational efficiency. To analyze the theoretical properties, we establish an oracle inequality on the expected loss minimized when estimating the model weights. Through comprehensive simulation studies and an application to large-scale land price prediction, we demonstrate that CDST consistently outperforms conventional model averaging methods, particularly on datasets where some models fail to capture the underlying complexity. Our findings suggest that CDST is especially valuable for, but not limited to, spatio-temporal prediction problems, offering a powerful tool for researchers and practitioners in various fields of data analysis.
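A minimal sketch of the prediction step, assuming a softmax link over basis-function scores (the paper's estimation via cross-validation and EM is omitted): the stacking weights are recomputed at each covariate value, so different models dominate in different regions of covariate space.

```python
import numpy as np

def cdst_predict(basis_values, theta, model_preds):
    """Covariate-dependent stacking at one covariate point x.
    basis_values: (K,) basis functions evaluated at x
    theta:        (M, K) coefficients, one row per candidate model
    model_preds:  (M,) candidate-model predictions at x"""
    scores = theta @ basis_values
    w = np.exp(scores - scores.max())
    w /= w.sum()                        # softmax: nonnegative weights summing to 1
    return float(w @ model_preds)
```

With constant basis functions this reduces to ordinary stacking; richer bases let the weights, and hence the dominant model, change smoothly with the covariates.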