 Energy


DeepSoCS: A Neural Scheduler for Heterogeneous System-on-Chip (SoC) Resource Scheduling

arXiv.org Artificial Intelligence

In this paper, we present a novel scheduling solution for a class of System-on-Chip (SoC) systems where heterogeneous chip resources (DSP, FPGA, GPU, etc.) must be efficiently scheduled for continuously arriving hierarchical jobs with their tasks represented by a directed acyclic graph. Traditionally, heuristic algorithms have been widely used in many resource scheduling domains, and Heterogeneous Earliest Finish Time (HEFT) has been a dominant state-of-the-art technique across a broad range of heterogeneous resource scheduling domains for many years. Despite their long-standing popularity, HEFT-like algorithms are known to be vulnerable to a small amount of noise added to the environment. Our Deep Reinforcement Learning (DRL)-based SoC Scheduler (DeepSoCS), capable of learning the "best" task ordering under dynamic environment changes, overcomes the brittleness of rule-based schedulers such as HEFT with significantly higher performance across different types of jobs. We describe a DeepSoCS design process using a real-time heterogeneous SoC scheduling emulator, discuss major challenges, and present two novel neural network design features that lead to outperforming HEFT: (i) hierarchical job- and task-graph embedding; and (ii) efficient use of real-time task information in the state space. Furthermore, we introduce effective techniques to address two fundamental challenges present in our environment: delayed consequences and joint actions. Through an extensive simulation study, we show that DeepSoCS achieves significantly better job execution time than HEFT, with a higher level of robustness under realistic noise conditions. We conclude with a discussion of potential improvements for our DeepSoCS neural scheduler.


Beyond Domain APIs: Task-oriented Conversational Modeling with Unstructured Knowledge Access

arXiv.org Artificial Intelligence

Most prior work on task-oriented dialogue systems is restricted to a limited coverage of domain APIs, while users often have domain-related requests that are not covered by the APIs. In this paper, we propose to expand coverage of task-oriented dialogue systems by incorporating external unstructured knowledge sources. We define three sub-tasks: knowledge-seeking turn detection, knowledge selection, and knowledge-grounded response generation, which can be modeled individually or jointly. We introduce an augmented version of MultiWOZ 2.1, which includes new out-of-API-coverage turns and responses grounded on external knowledge sources. We present baselines for each sub-task using both conventional and neural approaches. Our experimental results demonstrate the need for further research in this direction to enable more informative conversational systems.


Hierarchical robust aggregation of sales forecasts at aggregated levels in e-commerce, based on exponential smoothing and Holt's linear trend method

arXiv.org Machine Learning

We revisit the interest of classical statistical techniques for sales forecasting, such as exponential smoothing and extensions thereof (e.g., Holt's linear trend method). We do so by considering ensemble forecasts, given by several instances of these classical techniques tuned with different (sets of) parameters, and by forming convex combinations of the elements of ensemble forecasts over time, in a robust and sequential manner. The machine-learning theory behind this is called "robust online aggregation", or "prediction with expert advice", or "prediction of individual sequences" (see Cesa-Bianchi and Lugosi, 2006). We apply this methodology to a hierarchical data set of sales provided by the e-commerce company Cdiscount and output forecasts at the levels of subsubfamilies, subfamilies and families of items sold, for various forecasting horizons (up to 6 weeks ahead). The performance achieved is better than what would be obtained by optimally tuning the classical techniques on a train set and using their forecasts on the test set. The performance is also good from an intrinsic point of view (in terms of mean absolute percentage error). While getting these better forecasts of sales at the levels of subsubfamilies, subfamilies and families is interesting per se, we also suggest using them as additional features when forecasting demand at the item level.
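As a rough illustration of forming convex combinations of several exponential smoothing forecasts sequentially, the sketch below uses the standard exponentially weighted average rule from the prediction-with-expert-advice literature. The abstract does not say which aggregation rule the authors use, so this rule, the squared loss, the learning rate `eta`, and the function names `ses_forecasts` and `aggregate` are all assumptions made for illustration.

```python
import math


def ses_forecasts(series, alpha):
    """One-step-ahead simple exponential smoothing forecasts.

    forecasts[t] is the prediction for series[t] built from series[:t].
    The level is initialised to the first observation (an assumption).
    """
    level = series[0]
    forecasts = [level]
    for y in series[:-1]:
        level = alpha * y + (1 - alpha) * level
        forecasts.append(level)
    return forecasts


def aggregate(series, alphas, eta=0.1):
    """Exponentially weighted average of SES experts (illustrative rule).

    At each step the experts are combined with weights proportional to
    exp(-eta * cumulative squared loss); the weights are non-negative and
    sum to one, so each output is a convex combination of expert forecasts.
    """
    experts = [ses_forecasts(series, a) for a in alphas]
    cum_loss = [0.0] * len(alphas)
    combined = []
    for t, y in enumerate(series):
        best = min(cum_loss)  # subtract the minimum for numerical stability
        weights = [math.exp(-eta * (loss - best)) for loss in cum_loss]
        total = sum(weights)
        weights = [w / total for w in weights]
        combined.append(sum(w * e[t] for w, e in zip(weights, experts)))
        # Losses are revealed only after the combined forecast is made.
        for k, e in enumerate(experts):
            cum_loss[k] += (e[t] - y) ** 2
    return combined
```

Calling `aggregate(weekly_sales, alphas=[0.1, 0.3, 0.5, 0.9])` on a hypothetical list `weekly_sales` returns one combined forecast per time step; experts whose past forecasts were more accurate receive larger weights over time.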


Sponge Examples: Energy-Latency Attacks on Neural Networks

arXiv.org Machine Learning

The high energy costs of neural network training and inference led to the use of acceleration hardware such as GPUs and TPUs. While this enabled us to train large-scale neural networks in datacenters and deploy them on edge devices, the focus so far has been on average-case performance. In this work, we introduce a novel threat vector against neural networks whose energy consumption or decision latency is critical. We show how adversaries can exploit carefully crafted sponge examples, which are inputs designed to maximise energy consumption and latency. We mount two variants of this attack on established vision and language models, increasing energy consumption by a factor of 10 to 200. Our attacks can also be used to delay decisions where a network has critical real-time performance, such as in perception for autonomous vehicles. We demonstrate the portability of our malicious inputs across CPUs and a variety of hardware accelerator chips, including GPUs and an ASIC simulator. We conclude by proposing a defense strategy which mitigates our attack by shifting the analysis of energy consumption in hardware from an average-case to a worst-case perspective.


Self-Supervised Encoder for Fault Prediction in Electrochemical Cells

arXiv.org Machine Learning

Predicting faults before they occur helps to avoid potential safety hazards. Furthermore, planning the required maintenance actions in advance reduces operation costs. In this article, the focus is on electrochemical cells. In order to predict a cell's fault, the typical approach is to estimate the expected voltage that a healthy cell would present and compare it with the cell's measured voltage in real-time. This approach is possible because, when a fault is about to happen, the cell's measured voltage differs from the one expected for the same operating conditions. However, estimating the expected voltage is challenging, as the voltage of a healthy cell is also affected by its degradation -- an unknown parameter. Expert-defined parametric models are currently used for this estimation task. Instead, we propose the use of a neural network model based on an encoder-decoder architecture. The network receives the operating conditions as input. The encoder's task is to find a faithful representation of the cell's degradation and to pass it to the decoder, which in turn predicts the cell's expected voltage. As no labeled degradation data is given to the network, we consider our approach to be a self-supervised encoder. Results show that we were able to predict the voltage of multiple cells while reducing the prediction error obtained by the parametric models by 53%. This improvement enabled our network to predict a fault 31 hours before it happened, a 64% increase in reaction time compared to the parametric model. Moreover, the output of the encoder can be plotted, adding interpretability to the neural network model.


Neural Network Middle-Term Probabilistic Forecasting of Daily Power Consumption

arXiv.org Machine Learning

Middle-term horizon (months to a year) power consumption prediction is a major challenge in the energy sector, in particular when probabilistic forecasting is considered. We propose a new modelling approach that incorporates trend, seasonality and weather conditions as explanatory variables in a shallow Neural Network with an autoregressive feature. We obtain excellent results for density forecasting on the one-year test set when applying it to the daily power consumption in New England, U.S.A. The quality of the resulting probabilistic power consumption forecasts has been verified, on the one hand, by comparing the results to other standard models for density forecasting and, on the other hand, by considering measures that are frequently used in the energy sector, such as pinball loss and CI backtesting.
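For readers unfamiliar with the pinball loss mentioned above, a minimal sketch of its standard definition for a single quantile forecast follows. The function names `pinball_loss` and `mean_pinball_loss` are illustrative and not taken from the paper, and the abstract does not specify which quantile levels the authors evaluate.

```python
def pinball_loss(y_true, y_pred, q):
    """Pinball (quantile) loss for one observation.

    y_pred is the forecast of the q-quantile, with 0 < q < 1.
    Under-prediction is penalised by q, over-prediction by (1 - q).
    """
    diff = y_true - y_pred
    return q * diff if diff >= 0 else (q - 1) * diff


def mean_pinball_loss(y_true, y_pred, q):
    """Average pinball loss over paired observations and quantile forecasts."""
    losses = [pinball_loss(y, p, q) for y, p in zip(y_true, y_pred)]
    return sum(losses) / len(losses)
```

For a high quantile such as q = 0.9, a forecast that falls below the observed value costs nine times as much as one that exceeds it by the same amount, which is what pushes the minimiser toward the upper tail of the distribution.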


Is Fusion Really Close To Reality? Yes, Thanks To Machine Learning

#artificialintelligence

Fusion is energy's boy who cried wolf. It's been just around the corner for so long that people can't believe it's just around the corner now. "As a physicist, we always joke that fusion has been 50 years away for 50 years," said Daniel Kammen, a professor of energy at the University of California, Berkeley. "But in the last four or five years, with the effort that's going on here, the effort that's going on with Commonwealth Fusion in Massachusetts, you're suddenly seeing that old idea--that fusion is great but infinitely far away--has gone away."


Differentiable Linear Bandit Algorithm

arXiv.org Artificial Intelligence

Upper Confidence Bound (UCB) is arguably the most commonly used method for linear multi-armed bandit problems. While conceptually and computationally simple, this method relies heavily on the confidence bounds, failing to strike the optimal exploration-exploitation trade-off if these bounds are not properly set. In the literature, confidence bounds are typically derived from concentration inequalities based on assumptions on the reward distribution, e.g., sub-Gaussianity. The validity of these assumptions, however, is unknown in practice. In this work, we aim at learning the confidence bound in a data-driven fashion, making it adaptive to the actual problem structure. Specifically, noting that existing UCB-type algorithms are not differentiable with respect to the confidence bound, we first propose a novel differentiable linear bandit algorithm. Then, we introduce a gradient estimator, which allows the confidence bound to be learned via gradient ascent. Theoretically, we show that the proposed algorithm achieves a $\tilde{\mathcal{O}}(\hat{\beta}\sqrt{dT})$ upper bound on the $T$-round regret, where $d$ is the dimension of arm features and $\hat{\beta}$ is the learned size of the confidence bound. Empirical results show that $\hat{\beta}$ is significantly smaller than its theoretical upper bound and that the proposed algorithm outperforms the baselines on both simulated and real-world datasets.
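To make the role of the confidence bound concrete, here is a minimal sketch of a conventional linear UCB rule with a fixed, hand-set width `beta`. This is the non-differentiable baseline the abstract refers to, not the authors' differentiable algorithm or gradient estimator; the class name `LinUCB`, the ridge parameter `lam`, and the helper functions are assumptions for illustration.

```python
import math


def mat_vec(matrix, vec):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


class LinUCB:
    """Linear UCB with a fixed confidence width beta (illustrative baseline)."""

    def __init__(self, dim, beta=1.0, lam=1.0):
        self.dim = dim
        self.beta = beta
        # Inverse of the regularised design matrix A = lam * I + sum x x^T.
        self.a_inv = [[(1.0 / lam if i == j else 0.0) for j in range(dim)]
                      for i in range(dim)]
        self.b = [0.0] * dim  # running sum of reward * features

    def scores(self, arms):
        """Estimated reward plus beta times the exploration bonus per arm."""
        theta = mat_vec(self.a_inv, self.b)  # ridge regression estimate
        return [dot(theta, x) + self.beta * math.sqrt(dot(x, mat_vec(self.a_inv, x)))
                for x in arms]

    def select(self, arms):
        """Index of the arm with the highest upper confidence bound."""
        s = self.scores(arms)
        return max(range(len(arms)), key=s.__getitem__)

    def update(self, x, reward):
        """Sherman-Morrison rank-one update of the inverse design matrix."""
        ax = mat_vec(self.a_inv, x)
        denom = 1.0 + dot(x, ax)
        for i in range(self.dim):
            for j in range(self.dim):
                self.a_inv[i][j] -= ax[i] * ax[j] / denom
        self.b = [bi + reward * xi for bi, xi in zip(self.b, x)]
```

A larger `beta` inflates the bonus of rarely played arms and drives exploration, while a smaller one favours the current reward estimate; the paper's contribution, per the abstract, is to learn this width from data rather than fix it in advance.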


DASC: Towards A Road Damage-Aware Social-Media-Driven Car Sensing Framework for Disaster Response Applications

arXiv.org Machine Learning

While vehicular sensor networks (VSNs) have earned the stature of a mobile sensing paradigm utilizing sensors built into cars, they have limited sensing scopes since car drivers only opportunistically discover new events. Conversely, social sensing is emerging as a new sensing paradigm where measurements about the physical world are collected from humans. In contrast to VSNs, social sensing is more pervasive, but one of its key limitations lies in its inconsistent reliability stemming from the data contributed by unreliable human sensors. In this paper, we present DASC, a road Damage-Aware Social-media-driven Car sensing framework that exploits the collective power of social sensing and VSNs for reliable disaster response applications. However, integrating VSNs with social sensing introduces a new set of challenges: i) How to leverage noisy and unreliable social signals to route the vehicles to accurate regions of interest? ii) How to tackle the inconsistent availability (e.g., churn) caused by car drivers being rational actors? iii) How to efficiently guide the cars to the event locations with little prior knowledge of the road damage caused by the disaster, while also handling the dynamics of the physical world and social media? The DASC framework addresses the above challenges by establishing a novel hybrid social-car sensing system that employs techniques from game theory, feedback control, and Markov Decision Processes (MDPs). In particular, DASC distills signals emitted from social media and discovers road damage to effectively drive cars to target areas for verifying emergency events. We implement and evaluate DASC in a reputed vehicle simulator that can emulate real-world disaster response scenarios. The results of a real-world application demonstrate the superiority of DASC over current VSN-based solutions in detection accuracy and efficiency.


Constrained Reinforcement Learning for Dynamic Optimization under Uncertainty

arXiv.org Machine Learning

Dynamic real-time optimization (DRTO) is a challenging task because optimal operating conditions must be computed in real time. The main bottleneck in the industrial application of DRTO is the presence of uncertainty. Many stochastic systems present the following obstacles: 1) plant-model mismatch, 2) process disturbances, and 3) the risk of violating process constraints. To accommodate these difficulties, we present a constrained reinforcement learning (RL) based approach. RL naturally handles the process uncertainty by computing an optimal feedback policy. However, there is no intuitive way to introduce state constraints. To address this problem, we present a chance-constrained RL methodology. We use chance constraints to guarantee the probabilistic satisfaction of process constraints, which is accomplished by introducing backoffs, such that the optimal policy and backoffs are computed simultaneously. Backoffs are adjusted using the empirical cumulative distribution function to guarantee the satisfaction of a joint chance constraint. The advantage and performance of this strategy are illustrated through a stochastic dynamic bioprocess optimization problem, to produce sustainable high-value bioproducts.