Trustworthy Prediction with Gaussian Process Knowledge Scores

Butler, Kurt; Feng, Guanchao; Chen, Tong; Djuric, Petar

arXiv.org Machine Learning 

Abstract--Probabilistic models are often used to make predictions in regions of the data space where no observations are available, but it is not always clear whether such predictions are well-informed by previously seen data. In this paper, we propose a knowledge score for predictions from Gaussian process regression (GPR) models that quantifies the extent to which observing data has reduced our uncertainty about a prediction. The knowledge score is interpretable and naturally bounded between 0 and 1. We demonstrate in several experiments that the knowledge score can anticipate when predictions from a GPR model are accurate, and that this anticipation improves performance in tasks such as anomaly detection, extrapolation, and missing data imputation.

Index Terms--anomaly detection, Gaussian processes, regression models, trustworthy machine learning, predictive distributions.

The task of prediction is of fundamental importance in many domains.
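To make the idea of a bounded score that measures uncertainty reduction concrete, the following is a minimal sketch under stated assumptions, not the definition used in this paper. It assumes one plausible form, the relative variance reduction 1 - (posterior variance) / (prior variance), which lies in [0, 1] because conditioning a Gaussian process on data never increases the latent variance. The function names `rbf_kernel` and `variance_reduction_score`, the squared-exponential kernel, and all hyperparameter values are illustrative choices.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return signal_var * np.exp(-0.5 * sq_dist / length_scale**2)


def variance_reduction_score(x_train, x_test, length_scale=1.0,
                             signal_var=1.0, noise_var=1e-2):
    """Hypothetical score: 1 - posterior_var / prior_var at each test input.

    This is an assumed stand-in for a knowledge score, not the paper's
    definition. It depends only on the input locations, because the GP
    posterior variance does not depend on the observed targets.
    """
    # Noisy training covariance and train/test cross-covariance.
    k_train = rbf_kernel(x_train, x_train, length_scale, signal_var)
    k_train = k_train + noise_var * np.eye(len(x_train))
    k_cross = rbf_kernel(x_train, x_test, length_scale, signal_var)

    # Cholesky-based solve: the column-wise sum of v**2 equals
    # k_cross^T k_train^{-1} k_cross, the variance removed by the data.
    chol = np.linalg.cholesky(k_train)
    v = np.linalg.solve(chol, k_cross)

    prior_var = signal_var * np.ones(len(x_test))
    post_var = prior_var - np.sum(v**2, axis=0)

    # Clip only to guard against floating-point round-off.
    return np.clip(1.0 - post_var / prior_var, 0.0, 1.0)


# Usage: one test point among the training inputs, one far away.
x_train = np.array([0.0, 0.5, 1.0])
x_test = np.array([0.5, 10.0])
scores = variance_reduction_score(x_train, x_test)
```

Under these assumptions the score is close to 1 for a test input surrounded by training data and close to 0 for an input far from all observations, where the posterior reverts to the prior.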