Many real-world applications require the prediction of long sequence time-series, such as electricity consumption planning. Long sequence time-series forecasting (LSTF) demands a high prediction capacity of the model, which is the ability to capture precise long-range dependency coupling between output and input efficiently. Recent studies have shown the potential of Transformer to increase the prediction capacity. However, there are several severe issues with Transformer that prevent it from being directly applicable to LSTF, such as quadratic time complexity, high memory usage, and the inherent limitation of the encoder-decoder architecture. To address these issues, we design an efficient Transformer-based model for LSTF, named Informer, with three distinctive characteristics: (i) a $ProbSparse$ self-attention mechanism, which achieves $O(L \log L)$ time complexity and memory usage while delivering comparable performance on sequences' dependency alignment; (ii) self-attention distilling, which highlights dominating attention by halving the cascading layer input and efficiently handles extremely long input sequences; and (iii) a generative-style decoder, which, while conceptually simple, predicts long time-series sequences in one forward operation rather than step by step, drastically improving the inference speed of long-sequence predictions. Extensive experiments on four large-scale datasets demonstrate that Informer significantly outperforms existing methods and provides a new solution to the LSTF problem.
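To give a feel for the two efficiency claims in the abstract above, the following minimal sketch compares a quadratic cost with an $L \log L$ cost and lists the input length each layer would see if every layer halved its input. The function names, the base-2 logarithm, and the rounding rule for odd lengths are illustrative assumptions; none of this is taken from the Informer implementation.

```python
import math


def attention_costs(seq_len: int) -> tuple[int, float]:
    """Return (L^2, L * log2(L)) as rough operation counts for length L.

    The base of the logarithm is an assumption; big-O notation leaves it open.
    """
    return seq_len * seq_len, seq_len * math.log2(seq_len)


def distilled_lengths(seq_len: int, num_layers: int) -> list[int]:
    """Input length seen by each layer when every layer halves its input."""
    lengths = []
    for _ in range(num_layers):
        lengths.append(seq_len)
        # Assumption: odd lengths are rounded up when halved.
        seq_len = math.ceil(seq_len / 2)
    return lengths


if __name__ == "__main__":
    for length in (96, 720, 4096):
        quadratic, log_linear = attention_costs(length)
        print(length, quadratic, round(log_linear), distilled_lengths(length, 3))
```

For an input of 4096 steps, the quadratic count is about 16.8 million while the log-linear count is about 49 thousand, which is the gap that makes long inputs tractable.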
Active learning (AL) could contribute to solving critical environmental problems through improved spatiotemporal predictions. Yet such predictions involve high-dimensional feature spaces with mixed data types and missing data, which existing methods struggle to handle. Here, we propose a novel batch AL method that fills this gap. We encode and cluster features of candidate data points, and query the best data based on the distance of embedded features to their cluster centers. We introduce a new metric of informativeness that we call embedding entropy and a general class of neural networks that we call embedding networks for using it. Empirical tests on forecasting electricity demand show a simultaneous reduction in average prediction RMSE by up to 63-88% and data usage by up to 50-69% compared to passive learning (PL) benchmarks. Examples of such spatiotemporal predictions include the electricity consumption of buildings, required to operate sustainable power grids; the travel time between city zones, required for the smart charging of electric vehicles; and meteorological conditions, required for weather-based forecasting of wind and solar electricity generation. Sensing and labeling the ground truth data that is necessary for making these predictions in time and space usually comes at a high cost. This cost constrains the total number of sensors that we can place and use to query new data. A fundamental question that arises for many spatiotemporal prediction tasks is where and when to measure and query the data required to make the best possible predictions while staying within a maximum budget for sensors and data.
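The query step described in the abstract above (cluster the embedded candidates, then rank them by distance to their cluster center) can be sketched roughly as follows. This assumes the embeddings and cluster centers are already available; `select_batch`, `per_cluster`, and the `farthest` switch are hypothetical names, and the text does not say whether near or far points are preferred, so both options are exposed.

```python
import math


def nearest_center(point, centers):
    """Index of the cluster center closest to `point` (Euclidean distance)."""
    return min(range(len(centers)), key=lambda k: math.dist(point, centers[k]))


def select_batch(embeddings, centers, per_cluster=1, farthest=False):
    """Pick candidate indices per cluster, ranked by distance to the center.

    `farthest` is a hypothetical switch: the abstract does not state whether
    points near or far from their center count as the most informative.
    """
    clusters = {k: [] for k in range(len(centers))}
    for idx, point in enumerate(embeddings):
        k = nearest_center(point, centers)
        clusters[k].append((math.dist(point, centers[k]), idx))

    batch = []
    for members in clusters.values():
        members.sort(reverse=farthest)  # sort by distance, then by index
        batch.extend(idx for _, idx in members[:per_cluster])
    return sorted(batch)


if __name__ == "__main__":
    points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0),
              (5.0, 5.0), (5.2, 5.0), (7.0, 7.0)]
    centers = [(0.0, 0.0), (5.0, 5.0)]
    print(select_batch(points, centers))
    print(select_batch(points, centers, farthest=True))
```

Selecting a fixed number of points per cluster keeps the batch spread across the embedding space instead of concentrating queries in one region.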
In the Internet of Things (IoT) era, billions of sensors and devices collect and process data from the environment, transmit them to cloud centers, and receive feedback via the internet for connectivity and perception. However, transmitting massive amounts of heterogeneous data, perceiving complex environments from these data, and then making smart decisions in a timely manner are difficult. Artificial intelligence (AI), especially deep learning, is now a proven success in various areas including computer vision, speech recognition, and natural language processing. AI introduced into the IoT heralds the era of artificial intelligence of things (AIoT). This paper presents a comprehensive survey on AIoT to show how AI can empower the IoT to make it faster, smarter, greener, and safer. Specifically, we briefly present the AIoT architecture in the context of cloud computing, fog computing, and edge computing. Then, we present progress in AI research for IoT from four perspectives: perceiving, learning, reasoning, and behaving. Next, we summarize some promising applications of AIoT that are likely to profoundly reshape our world. Finally, we highlight the challenges facing AIoT and some potential research opportunities.
Uncertainty quantification (UQ) plays a pivotal role in reducing uncertainties during both optimization and decision-making processes. It can be applied to a variety of real-world problems in science and engineering. Bayesian approximation and ensemble learning techniques are the two most widely used UQ methods in the literature. In this regard, researchers have proposed different UQ methods and examined their performance in a variety of applications such as computer vision (e.g., self-driving cars and object detection), image processing (e.g., image restoration), medical image analysis (e.g., medical image classification and segmentation), natural language processing (e.g., text classification, social media texts and recidivism risk-scoring), bioinformatics, etc. This study reviews recent advances in UQ methods used in deep learning. Moreover, we also investigate the application of these methods in reinforcement learning (RL). Then, we outline a few important applications of UQ methods. Finally, we briefly highlight the fundamental research challenges faced by UQ methods and discuss the future research directions in this field.
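As a rough illustration of the ensemble family of UQ methods mentioned above, one common recipe is to treat the spread of the predictions made by several independently trained models as an uncertainty estimate. The sketch below assumes the member predictions are already computed; `ensemble_uncertainty` is an illustrative name, and using the population standard deviation is one choice among several.

```python
from statistics import mean, pstdev


def ensemble_uncertainty(predictions_by_member):
    """Per-input mean and spread across ensemble members.

    predictions_by_member[m][i] is the prediction of member m for input i.
    Returns one (mean, standard deviation) pair per input.
    """
    per_input = zip(*predictions_by_member)  # regroup predictions by input
    return [(mean(values), pstdev(values)) for values in per_input]


if __name__ == "__main__":
    members = [
        [1.0, 10.0],  # member 0
        [1.0, 12.0],  # member 1
        [1.0, 14.0],  # member 2
    ]
    for avg, spread in ensemble_uncertainty(members):
        print(f"prediction={avg:.2f} uncertainty={spread:.2f}")
```

In this toy case the members agree on the first input (zero spread) and disagree on the second, so the second prediction would be flagged as less certain.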
High-resolution data are desired in many data-driven applications; however, in many cases only data at a lower resolution than expected are available, for various reasons. The challenge is then to extract as much useful information as possible from the low-resolution data. In this paper, we target interval energy data collected by Advanced Metering Infrastructure (AMI), and propose a Super-Resolution Reconstruction (SRR) approach to upsample low-resolution (hourly) interval data into higher-resolution (15-minute) data using deep learning. Our preliminary results show that the proposed SRR approach achieves much improved performance compared to the baseline model.
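To make the hourly-to-15-minute task concrete, here is a naive reference upsampler that splits each hourly value evenly across four 15-minute intervals. It assumes each reading is the energy consumed during its interval, so the hourly total is preserved. This is only an illustration of the input and output shapes; it is not the baseline model or the SRR approach from the abstract, and `upsample_even` is a hypothetical name.

```python
def upsample_even(hourly_energy, factor=4):
    """Split each interval reading evenly into `factor` sub-interval readings.

    Assumption: readings are energy per interval (e.g. kWh), so the parts of
    one hour must add up to the original hourly value.
    """
    fine = []
    for reading in hourly_energy:
        fine.extend([reading / factor] * factor)
    return fine


if __name__ == "__main__":
    hourly = [4.0, 8.0]
    quarter_hourly = upsample_even(hourly)
    print(quarter_hourly)
    print(sum(quarter_hourly) == sum(hourly))
```

A learned super-resolution model would replace the flat profile within each hour with a reconstructed shape, ideally while still respecting the hourly totals.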
Rising penetration levels of (residential) photovoltaic (PV) power as a distributed energy resource pose a number of challenges to the electricity infrastructure. High-quality, general tools to provide accurate forecasts of power production are urgently needed. In this article, we propose a supervised deep learning model for end-to-end forecasting of PV power production. The proposed model is based on two seminal concepts that led to significant performance improvements of deep learning approaches in other sequence-related fields, but not yet in the area of time series prediction: the sequence-to-sequence architecture and the attention mechanism as a context generator. The proposed model leverages numerical weather predictions and high-resolution historical measurements to forecast a binned probability distribution over the prognostic time intervals, rather than the expected values of the prognostic variable. This design offers significant performance improvements compared to common baseline approaches, such as fully connected neural networks and one-block long short-term memory architectures. The proposed approach is compared to other models using a forecast skill score based on normalized root mean square error as the performance indicator. The results show that the new design performs at or above the current state of the art of PV power forecasting.
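The performance indicator named in the abstract above can be sketched under a common convention, where the skill score is one minus the ratio of the model's normalized RMSE to that of a reference forecast. The exact definition and the normalization constant used in the article are not given here, so the formula, the `norm` parameter, and the function names below are assumptions.

```python
import math


def nrmse(actual, forecast, norm=1.0):
    """RMSE divided by `norm` (e.g. installed capacity; the choice is assumed)."""
    mse = sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)
    return math.sqrt(mse) / norm


def skill_score(actual, forecast, reference, norm=1.0):
    """1 - nRMSE(model) / nRMSE(reference).

    0 means no better than the reference forecast, 1 means a perfect forecast,
    and negative values mean worse than the reference.
    """
    return 1.0 - nrmse(actual, forecast, norm) / nrmse(actual, reference, norm)


if __name__ == "__main__":
    actual = [1.0, 2.0, 3.0, 4.0]
    reference = [0.0, 1.0, 2.0, 3.0]  # e.g. a persistence-style forecast
    model = [1.5, 2.5, 3.5, 4.5]
    print(skill_score(actual, model, reference))
```

When the same `norm` is applied to both forecasts it cancels in the ratio, so the normalization mainly matters when the nRMSE values are reported on their own.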
Multivariate time series (MTS) regression tasks are common in many real-world data mining applications including finance, cybersecurity, energy, healthcare, prognostics, and many others. Due to the tremendous success of deep learning (DL) algorithms in various domains including image recognition and computer vision, researchers have started adopting these techniques for solving MTS data mining problems, many of which target safety-critical and cost-critical applications. Unfortunately, DL algorithms are known for their susceptibility to adversarial examples, which makes DL regression models for MTS forecasting vulnerable to those attacks as well. To the best of our knowledge, no previous work has explored the vulnerability of DL MTS regression models to adversarial time series examples, which is an important step, specifically when the forecasts from such models are used in safety-critical and cost-critical applications. In this work, we leverage existing adversarial attack generation techniques from the image classification domain and craft adversarial multivariate time series examples for three state-of-the-art deep learning regression models, specifically Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU). We evaluate our approach on the Google stock and household power consumption datasets. The obtained results show that all the evaluated DL regression models are vulnerable to adversarial attacks, that the attacks are transferable, and that they can thus lead to catastrophic consequences in safety-critical and cost-critical domains, such as energy and finance.
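One well-known attack from the image classification domain is the fast gradient sign method (FGSM), which shifts every input feature by a small step in the direction that increases the loss. The abstract above does not name the techniques it uses, so FGSM is chosen here only as an illustration, applied to a toy linear regressor instead of a CNN, LSTM, or GRU. All names and numbers are hypothetical.

```python
def predict(weights, bias, x):
    """Linear regression prediction w . x + b."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def sign(value):
    """Return -1, 0, or 1 depending on the sign of `value`."""
    return (value > 0) - (value < 0)


def fgsm_example(weights, bias, x, target, epsilon):
    """Shift each feature by +/- epsilon in the direction that raises squared error."""
    residual = predict(weights, bias, x) - target
    # Gradient of (prediction - target)^2 with respect to x_i is 2 * residual * w_i.
    grad = [2.0 * residual * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]


if __name__ == "__main__":
    weights, bias = [2.0, -1.0], 0.5
    x, target = [1.0, 3.0], 0.0
    x_adv = fgsm_example(weights, bias, x, target, epsilon=0.1)
    print("clean error:", (predict(weights, bias, x) - target) ** 2)
    print("adversarial error:", (predict(weights, bias, x_adv) - target) ** 2)
```

The perturbation is bounded by `epsilon` per feature, yet the squared error of this toy model grows noticeably, which is the effect the abstract reports for deep regression models.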
Non-Intrusive Load Monitoring (NILM) is a field of research focused on segregating constituent electrical loads in a system based only on their aggregated signal. Significant computational resources and research time are spent training models, often using as much data as possible, perhaps driven by the preconception that more data equates to more accurate models and better performing algorithms. When has enough prior training been done? When has a NILM algorithm encountered new, unseen data? This work applies the notion of Bayesian surprise to answer these questions, which are important for both supervised and unsupervised algorithms. We quantify the degree of surprise in the predictive distribution (termed postdictive surprise) and in the transitional probabilities (termed transitional surprise) before and after a window of observations. We compare the performance of several benchmark NILM algorithms supported by NILMTK in order to establish a useful threshold on the two combined measures of surprise. We validate the use of transitional surprise by exploring the performance of a popular Hidden Markov Model as a function of the surprise threshold. Finally, we explore the use of a surprise threshold as a regularization technique to avoid overfitting in cross-dataset performance. Although the generality of the specific surprise threshold discussed herein may be suspect without further testing, this work provides clear evidence that a point of diminishing returns of model performance with respect to dataset size exists. This has implications for future model development and dataset acquisition, and can aid model flexibility during deployment.
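Bayesian surprise is commonly measured as the Kullback-Leibler divergence between the distribution held after a window of observations and the one held before it. The sketch below applies that idea to discrete distributions and compares the result with a threshold; the exact formulation used in this work is not given in the abstract, and the function names and the threshold value are hypothetical.

```python
import math


def bayesian_surprise(prior, posterior):
    """KL(posterior || prior) in nats for two discrete distributions.

    Both lists must cover the same states, and the prior must be non-zero
    wherever the posterior is non-zero.
    """
    return sum(q * math.log(q / p) for p, q in zip(prior, posterior) if q > 0)


def is_surprising(prior, posterior, threshold):
    """True when the window of observations changed the distribution enough."""
    return bayesian_surprise(prior, posterior) > threshold


if __name__ == "__main__":
    before = [0.5, 0.5]
    after_similar = [0.55, 0.45]
    after_shifted = [0.95, 0.05]
    threshold = 0.1  # hypothetical value, for illustration only
    print(is_surprising(before, after_similar, threshold))
    print(is_surprising(before, after_shifted, threshold))
```

A window that leaves the distribution almost unchanged yields a surprise near zero, which is one way to read "enough training has been done", while a large value signals new, unseen behavior.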
Advances in technology have made many household appliances more energy efficient, and even given older ones some energy-saving smarts, but addressing the power usage of each individual device across the home is still a tall order. Researchers at Cornell University have been working on more of a one-size-fits-all solution, developing a vibration-sensing device that can keep tabs on appliance usage through machine learning and lasers. The team points to smart homes of the future as its inspiration for developing the VibroSense device, imagining scenarios where the house itself knows when a washing machine has completed its cycle, when a microwave has finished heating food, or when a faucet is dripping. While replacing each appliance with a smart version or attaching specific sensors to each one could be one way to tackle this, the Cornell team sees a more efficient way forward. "In order to have a smart home at this point, you'd need each device to be smart, which is not realistic; or you'd need to install separate sensors on each device or in each area," says Cheng Zhang, assistant professor of information science and senior author of the study.
Energy theft causes large economic losses to utility companies around the world. In recent years, energy theft detection approaches based on machine learning (ML) techniques, especially neural networks, have become popular in the research literature and achieve state-of-the-art detection performance. However, in this work, we demonstrate that well-performing ML models for energy theft detection are highly vulnerable to adversarial attacks. In particular, we design an adversarial measurement generation algorithm that enables the attacker to report extremely low power consumption measurements to the utilities while bypassing the ML energy theft detection. We evaluate our approach with three kinds of neural networks based on a real-world smart meter dataset. The evaluation results demonstrate that our approach can significantly decrease the ML models' detection accuracy, even for black-box attackers.