Learning Data-Driven Uncertainty Set Partitions for Robust and Adaptive Energy Forecasting with Missing Data

Stratigakos, Akylas, Andrianesis, Panagiotis

arXiv.org Machine Learning 

Abstract--Short-term forecasting models typically assume the availability of input data (features) when they are deployed and in use. However, equipment failures, disruptions, or cyberattacks may lead to missing features when such models are used operationally, which could negatively affect forecast accuracy and result in suboptimal operational decisions. In this paper, we use adaptive robust optimization and adversarial machine learning to develop forecasting models that seamlessly handle missing data operationally. We propose linear- and neural network-based forecasting models with parameters that adapt to available features, combining linear adaptation with a novel algorithm for learning data-driven uncertainty set partitions. The proposed adaptive models do not rely on identifying historical missing data patterns and are suitable for real-time operations under stringent time constraints. Extensive numerical experiments on short-term wind power forecasting, with horizons from 15 minutes to 4 hours ahead, illustrate that our proposed adaptive models are on par with imputation when data are missing for very short periods (e.g., when only the latest measurement is missing), whereas they significantly outperform imputation when data are missing for longer periods. We further provide insights by showcasing how linear adaptation and data-driven partitions (even with a few subsets) approach the performance of the optimal, yet impractical, method of retraining for every possible realization of missing data.

Index Terms--Short-term forecasting, wind power forecasting, missing data, adaptive robust optimization, data-driven uncertainty set partitioning, adversarial learning.

Variable renewable energy sources, such as wind and solar, dominate low-carbon power systems. To deal with their inherent uncertainty and variability, system operators manage operational risk based on a forward-looking grid status estimation [1].
For instance, they run short-term scheduling applications to evaluate the reliability of market-based dispatch, which are based on short-term energy forecasts with a horizon ranging from a few minutes to several hours ahead [2].

A. Background and Motivation

A critical assumption underpinning the forecasting models is that input data, a.k.a.