Active machine learning for spatio-temporal predictions using feature embedding

Aryandoust, Arsam; Pfenninger, Stefan

arXiv.org Machine Learning 

Active learning (AL) could contribute to solving critical environmental problems through improved spatiotemporal predictions. Yet such predictions involve high-dimensional feature spaces with mixed data types and missing data, which existing methods have difficulty handling. Here, we propose a novel batch AL method that fills this gap. We encode and cluster the features of candidate data points, and query the best data based on the distance of embedded features to their cluster centers. We introduce a new metric of informativeness that we call embedding entropy, and a general class of neural networks that we call embedding networks for using it. Empirical tests on forecasting electricity demand show a simultaneous reduction in average prediction RMSE by up to 63-88% and in data usage by up to 50-69% compared to passive learning (PL) benchmarks.

Examples of such spatiotemporal prediction targets include the electricity consumption of buildings, required to operate sustainable power grids; the travel time between city zones, required for the smart charging of electric vehicles; and meteorological conditions, required for weather-based forecasting of wind and solar electricity generation. Sensing and labeling the ground truth data needed to make these predictions in time and space usually comes at a high cost. This cost constrains the total number of sensors that we can place and use to query new data. A fundamental question for many spatiotemporal prediction tasks is therefore where and when to measure and query the data required to make the best possible predictions while staying within a maximum budget for sensors and data.
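The query step described above (cluster the embedded features of candidate points, then select by distance to cluster centers) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the candidates have already been encoded into a numeric array `embedded`, substitutes a plain NumPy k-means for whatever clustering the paper uses, and treats "closest to the center" versus "farthest from the center" as a switchable choice. The names `kmeans`, `select_batch`, and `closest` are hypothetical.

```python
import numpy as np


def kmeans(points, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means (an assumed stand-in); returns (centers, labels)."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise centers with k distinct candidate points.
    centers = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared distance of every point to every center, shape (n, k).
        sq_dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = sq_dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    # Final assignment against the last updated centers.
    sq_dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, sq_dists.argmin(axis=1)


def select_batch(embedded, batch_size, closest=True, seed=0):
    """Return at most one candidate index per cluster, chosen by distance
    of the embedded point to its own cluster center."""
    embedded = np.asarray(embedded, dtype=float)
    centers, labels = kmeans(embedded, batch_size, seed=seed)
    dist_to_own = np.linalg.norm(embedded - centers[labels], axis=1)
    chosen = []
    for j in range(batch_size):
        idx = np.flatnonzero(labels == j)
        if idx.size == 0:
            continue  # empty cluster: nothing to query from it
        local = dist_to_own[idx]
        pick = idx[local.argmin()] if closest else idx[local.argmax()]
        chosen.append(int(pick))
    return chosen


# Usage: 200 hypothetical candidates embedded in 8 dimensions, budget of 10 queries.
rng = np.random.default_rng(1)
embedded = rng.normal(size=(200, 8))
query_indices = select_batch(embedded, batch_size=10)
```

Setting the number of clusters equal to the query budget spreads the batch across the embedded feature space, so each queried point represents a different group of candidates. This is one plausible reading of the abstract, not a confirmed detail of the method.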
