Machine learning for total cloud cover prediction

arXiv.org Machine Learning

Accurate and reliable forecasting of total cloud cover (TCC) is vital for many areas such as astronomy, energy demand and production, or agriculture. Most meteorological centres issue ensemble forecasts of TCC; however, these forecasts are often uncalibrated and exhibit worse forecast skill than ensemble forecasts of other weather variables. Hence, some form of post-processing is needed to improve predictive performance. As TCC observations are usually reported on a discrete scale taking just nine different values called oktas, statistical calibration of TCC ensemble forecasts can be considered a classification problem with outputs given by the probabilities of the oktas. This is a classical area where machine learning methods are applied. We investigate the performance of post-processing using multilayer perceptron (MLP) neural networks, gradient boosting machines (GBM) and random forest (RF) methods. Based on the European Centre for Medium-Range Weather Forecasts global TCC ensemble forecasts for 2002-2014, we compare these approaches with the proportional odds logistic regression (POLR) and multiclass logistic regression (MLR) models, as well as the raw TCC ensemble forecasts. We further assess whether improvements in forecast skill can be obtained by incorporating ensemble forecasts of precipitation as an additional predictor. Compared to the raw ensemble, all calibration methods result in a significant improvement in forecast skill. RF models provide the smallest increase in predictive performance, while MLP, POLR and GBM approaches perform best.
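To make the classification framing concrete, the following is a minimal sketch of how a raw TCC ensemble could be read as a probability forecast over the nine okta classes, which is the baseline that any calibration method would then try to improve. It is an illustration under stated assumptions, not the procedure used in the paper: the function names `fraction_to_okta` and `raw_ensemble_probabilities` are hypothetical, ensemble members are assumed to be cloud-cover fractions in [0, 1], and the mapping to oktas by simple rounding is an assumed convention that may differ from the binning applied to the actual data.

```python
from collections import Counter

N_OKTAS = 9  # okta classes 0, 1, ..., 8


def fraction_to_okta(tcc_fraction):
    """Map a cloud-cover fraction in [0, 1] to an okta in 0..8.

    Assumption: nearest-okta rounding; the paper's binning may differ.
    """
    if not 0.0 <= tcc_fraction <= 1.0:
        raise ValueError("cloud-cover fraction must lie in [0, 1]")
    return int(tcc_fraction * 8 + 0.5)


def raw_ensemble_probabilities(members):
    """Relative frequency of each okta among the ensemble members.

    Returns a list of nine probabilities, one per okta class.
    """
    if not members:
        raise ValueError("ensemble must contain at least one member")
    counts = Counter(fraction_to_okta(m) for m in members)
    return [counts.get(k, 0) / len(members) for k in range(N_OKTAS)]


# Hypothetical four-member ensemble of cloud-cover fractions.
probs = raw_ensemble_probabilities([0.0, 0.0, 0.5, 1.0])
```

A post-processing model such as MLR or an MLP would output a vector of the same shape, so raw and calibrated forecasts can be compared class by class against the observed okta.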
Key words: ensemble calibration; gradient boosting machine; logistic regression; multilayer perceptron; random forest; total cloud cover

1 Introduction

Reliable and accurate prediction of total cloud cover (TCC) is of principal importance in observational astronomy (Ye and Chen, 2013) and in the prediction of photovoltaic energy production, as it is the main cause of variation in solar-radiation energy supply (Matuszko, 2012; McEvoy et al., 2012); it is also of great relevance in agriculture, tourism and some other sectors of the economy.