Goto

Collaborating Authors

 picp


Bayesian Physics Informed Neural Networks for Reliable Transformer Prognostics

arXiv.org Artificial Intelligence

Scientific Machine Learning (SciML) integrates physics and data into the learning process, offering improved generalization compared with purely data-driven models. Despite its potential, applications of SciML in prognostics remain limited, partly due to the complexity of incorporating partial differential equations (PDEs) for ageing physics and the scarcity of robust uncertainty quantification methods. This work introduces a Bayesian Physics-Informed Neural Network (B-PINN) framework for probabilistic prognostics estimation. By embedding Bayesian Neural Networks into the PINN architecture, the proposed approach produces principled, uncertainty-aware predictions. The method is applied to a transformer ageing case study, where insulation degradation is primarily driven by thermal stress. The heat diffusion PDE is used as the physical residual, and different prior distributions are investigated to examine their impact on predictive posterior distributions and their ability to encode a priori physical knowledge. The framework is validated against a finite element model developed and tested with real measurements from a solar power plant. Results, benchmarked against a dropout-PINN baseline, show that the proposed B-PINN delivers more reliable prognostic predictions by accurately quantifying predictive uncertainty. This capability is crucial for supporting robust and informed maintenance decision-making in critical power assets.


Evaluating the Quality of the Quantified Uncertainty for (Re)Calibration of Data-Driven Regression Models

arXiv.org Machine Learning

In safety-critical applications data-driven models must not only be accurate but also provide reliable uncertainty estimates. This property, commonly referred to as calibration, is essential for risk-aware decision-making. In regression a wide variety of calibration metrics and recalibration methods have emerged. However, these metrics differ significantly in their definitions, assumptions and scales, making it difficult to interpret and compare results across studies. Moreover, most recalibration methods have been evaluated using only a small subset of metrics, leaving it unclear whether improvements generalize across different notions of calibration. In this work, we systematically extract and categorize regression calibration metrics from the literature and benchmark these metrics independently of specific modelling methods or recalibration approaches. Through controlled experiments with real-world, synthetic and artificially miscalibrated data, we demonstrate that calibration metrics frequently produce conflicting results. Our analysis reveals substantial inconsistencies: many metrics disagree in their evaluation of the same recalibration result, and some even indicate contradictory conclusions. This inconsistency is particularly concerning as it potentially allows cherry-picking of metrics to create misleading impressions of success. We identify the Expected Normalized Calibration Error (ENCE) and the Coverage Width-based Criterion (CWC) as the most dependable metrics in our tests. Our findings highlight the critical role of metric selection in calibration research.


Tube Loss: A Novel Approach for Prediction Interval Estimation and probabilistic forecasting

arXiv.org Artificial Intelligence

This paper proposes a novel loss function, called 'Tube Loss', for simultaneous estimation of bounds of a Prediction Interval (PI) in the regression setup, and also for generating probabilistic forecasts from time series data solving a single optimization problem. The PIs obtained by minimizing the empirical risk based on the Tube Loss are shown to be of better quality than the PIs obtained by the existing methods in the following sense. First, it yields intervals that attain the prespecified confidence level $t \in(0,1)$ asymptotically. A theoretical proof of this fact is given. Secondly, the user is allowed to move the interval up or down by controlling the value of a parameter. This helps the user to choose a PI capturing denser regions of the probability distribution of the response variable inside the interval, and thus, sharpening its width. This is shown to be especially useful when the conditional distribution of the response variable is skewed. Further, the Tube Loss based PI estimation method can trade-off between the coverage and the average width by solving a single optimization problem. It enables further reduction of the average width of PI through re-calibration. Also, unlike a few existing PI estimation methods the gradient descent (GD) method can be used for minimization of empirical risk. Finally, through extensive experimentation, we have shown the efficacy of the Tube Loss based PI estimation in kernel machines, neural networks and deep networks and also for probabilistic forecasting tasks. The codes of the experiments are available at https://github.com/ltpritamanand/Tube_loss


Large width penalization for neural network-based prediction interval estimation

arXiv.org Artificial Intelligence

Forecasting accuracy in highly uncertain environments is challenging due to the stochastic nature of systems. Deterministic forecasting provides only point estimates and cannot capture potential outcomes. Therefore, probabilistic forecasting has gained significant attention due to its ability to quantify uncertainty, where one of the approaches is to express it as a prediction interval (PI), that explicitly shows upper and lower bounds of predictions associated with a confidence level. High-quality PI is characterized by a high PI coverage probability (PICP) and a narrow PI width. In many real-world applications, the PI width is generally used in risk management to prepare resources that improve reliability and effectively manage uncertainty. A wider PI width results in higher costs for backup resources as decision-making processes often focus on the worst-case scenarios arising with large PI widths under extreme conditions. This study aims to reduce the large PI width from the PI estimation method by proposing a new PI loss function that penalizes the average of the large PI widths more heavily. The proposed formulation is compatible with gradient-based algorithms, the standard approach to training neural networks (NNs), and integrating state-of-the-art NNs and existing deep learning techniques. Experiments with the synthetic dataset reveal that our formulation significantly reduces the large PI width while effectively maintaining the PICP to achieve the desired probability. The practical implementation of our proposed loss function is demonstrated in solar irradiance forecasting, highlighting its effectiveness in minimizing the large PI width in data with high uncertainty and showcasing its compatibility with more complex neural network models. Therefore, reducing large PI widths from our method can lead to significant cost savings by over-allocation of reserve resources.


Hessian-Free Laplace in Bayesian Deep Learning

arXiv.org Machine Learning

The Laplace approximation (LA) of the Bayesian posterior is a Gaussian distribution centered at the maximum a posteriori estimate. Its appeal in Bayesian deep learning stems from the ability to quantify uncertainty post-hoc (i.e., after standard network parameter optimization), the ease of sampling from the approximate posterior, and the analytic form of model evidence. However, an important computational bottleneck of LA is the necessary step of calculating and inverting the Hessian matrix of the log posterior. The Hessian may be approximated in a variety of ways, with quality varying with a number of factors including the network, dataset, and inference task. In this paper, we propose an alternative framework that sidesteps Hessian calculation and inversion. The Hessian-free Laplace (HFL) approximation uses curvature of both the log posterior and network prediction to estimate its variance. Only two point estimates are needed: the standard maximum a posteriori parameter and the optimal parameter under a loss regularized by the network prediction. We show that, under standard assumptions of LA in Bayesian deep learning, HFL targets the same variance as LA, and can be efficiently amortized in a pre-trained network. Experiments demonstrate comparable performance to that of exact and approximate Hessians, with excellent coverage for in-between uncertainty.


Uncertainty Quantification in Neural-Network Based Pain Intensity Estimation

arXiv.org Artificial Intelligence

Improper pain management can lead to severe physical or mental consequences, including suffering, and an increased risk of opioid dependency. Assessing the presence and severity of pain is imperative to prevent such outcomes and determine the appropriate intervention. However, the evaluation of pain intensity is challenging because different individuals experience pain differently. To overcome this, researchers have employed machine learning models to evaluate pain intensity objectively. However, these efforts have primarily focused on point estimation of pain, disregarding the inherent uncertainty and variability present in the data and model. Consequently, the point estimates provide only partial information for clinical decision-making. This study presents a neural network-based method for objective pain interval estimation, incorporating uncertainty quantification. This work explores three algorithms: the bootstrap method, lower and upper bound estimation (LossL) optimized by genetic algorithm, and modified lower and upper bound estimation (LossS) optimized by gradient descent algorithm. Our empirical results reveal that LossS outperforms the other two by providing a narrower prediction interval. As LossS outperforms, we assessed its performance in three different scenarios for pain assessment: (1) a generalized approach (single model for the entire population), (2) a personalized approach (separate model for each individual), and (3) a hybrid approach (separate model for each cluster of individuals). Our findings demonstrate the hybrid approach's superior performance, with notable practicality in clinical contexts. It has the potential to be a valuable tool for clinicians, enabling objective pain intensity assessment while taking uncertainty into account. This capability is crucial in facilitating effective pain management and reducing the risks associated with improper treatment.


Structure and Distribution Metric for Quantifying the Quality of Uncertainty: Assessing Gaussian Processes, Deep Neural Nets, and Deep Neural Operators for Regression

arXiv.org Machine Learning

We propose two bounded comparison metrics that may be implemented to arbitrary dimensions in regression tasks. One quantifies the structure of uncertainty and the other quantifies the distribution of uncertainty. The structure metric assesses the similarity in shape and location of uncertainty with the true error, while the distribution metric quantifies the supported magnitudes between the two. We apply these metrics to Gaussian Processes (GPs), Ensemble Deep Neural Nets (DNNs), and Ensemble Deep Neural Operators (DNOs) on high-dimensional and nonlinear test cases. We find that comparing a model's uncertainty estimates with the model's squared error provides a compelling ground truth assessment. We also observe that both DNNs and DNOs, especially when compared to GPs, provide encouraging metric values in high dimensions with either sparse or plentiful data.


Accurate Prediction and Uncertainty Estimation using Decoupled Prediction Interval Networks

arXiv.org Machine Learning

We propose a network architecture capable of reliably estimating uncertainty of regression based predictions without sacrificing accuracy. The current state-of-the-art uncertainty algorithms either fall short of achieving prediction accuracy comparable to the mean square error optimization or underestimate the variance of network predictions. We propose a decoupled network architecture that is capable of accomplishing both at the same time. We achieve this by breaking down the learning of prediction and prediction interval (PI) estimations into a two-stage training process. We use a custom loss function for learning a PI range around optimized mean estimation with a desired coverage of a proportion of the target labels within the PI range. We compare the proposed method with current state-of-the-art uncertainty quantification algorithms on synthetic datasets and UCI benchmarks, reducing the error in the predictions by 23 to 34% while maintaining 95% Prediction Interval Coverage Probability (PICP) for 7 out of 9 UCI benchmark datasets. We also examine the quality of our predictive uncertainty by evaluating on Active Learning and demonstrating 17 to 36% error reduction on UCI benchmarks.


How to Evaluate Uncertainty Estimates in Machine Learning for Regression?

arXiv.org Machine Learning

As neural networks become more popular, the need for accompanying uncertainty estimates increases. The current testing methodology focusses on how good the predictive uncertainty estimates explain the differences between predictions and observations in a previously unseen test set. Intuitively this is a logical approach. The current setup of benchmark data sets also allows easy comparison between the different methods. We demonstrate, however, through both theoretical arguments and simulations that this way of evaluating the quality of uncertainty estimates has serious flaws. Firstly, it cannot disentangle the aleatoric from the epistemic uncertainty. Secondly, the current methodology considers the uncertainty averaged over all test samples, implicitly averaging out overconfident and underconfident predictions. When checking if the correct fraction of test points falls inside prediction intervals, a good score on average gives no guarantee that the intervals are sensible for individual points. We demonstrate through practical examples that these effects can result in favoring a method, based on the predictive uncertainty, that has undesirable behaviour of the confidence intervals. Finally, we propose a simulation-based testing approach that addresses these problems while still allowing easy comparison between different methods.


Optimal Uncertainty-guided Neural Network Training

arXiv.org Machine Learning

The neural network (NN)-based direct uncertainty quantification (UQ) methods have achieved the state of the art performance since the first inauguration, known as the lower-upper-bound estimation (LUBE) method. However, currently-available cost functions for uncertainty guided NN training are not always converging and all converged NNs are not generating optimized prediction intervals (PIs). Moreover, several groups have proposed different quality criteria for PIs. These raise a question about their relative effectiveness. Most of the existing cost functions of uncertainty guided NN training are not customizable and the convergence of training is uncertain. Therefore, in this paper, we propose a highly customizable smooth cost function for developing NNs to construct optimal PIs. The optimized average width of PIs, PI-failure distances and the PI coverage probability (PICP) are computed for the test dataset. The performance of the proposed method is examined for the wind power generation and the electricity demand data. Results show that the proposed method reduces variation in the quality of PIs, accelerates the training, and improves convergence probability from 99.2% to 99.8%.