 Energy


Challenges and Countermeasures for Adversarial Attacks on Deep Reinforcement Learning

arXiv.org Artificial Intelligence

Deep Reinforcement Learning (DRL) has numerous real-world applications thanks to its outstanding ability to adapt quickly to its surrounding environment. Despite its great advantages, DRL is susceptible to adversarial attacks, which precludes its use in real-life critical systems and applications (e.g., smart grids, traffic controls, and autonomous vehicles) unless its vulnerabilities are addressed and mitigated. Thus, this paper provides a comprehensive survey that discusses emerging attacks on DRL-based systems and the potential countermeasures to defend against these attacks. We first cover the fundamental background of DRL and present emerging adversarial attacks on machine learning techniques. We then investigate in more detail the vulnerabilities that an adversary can exploit to attack DRL, along with the state-of-the-art countermeasures to prevent such attacks. Finally, we highlight open issues and research challenges for developing solutions to deal with attacks on DRL-based intelligent systems.


Cut back on email if you want to fight global warming

The Japan Times

NEW YORK – Everyone has seen warnings at the end of email saying, "Please consider the environment before printing." But for those who care about global warming, you might want to consider not writing so many emails in the first place. More and more, people rely on their electronic mailboxes as a life organizer. Old emails, photos and files from years past sit undisturbed, awaiting your search for a name, lost address, or maybe a photo of an old boyfriend. The problem is that all those messages require energy to preserve them.


Bilevel Optimization for Differentially Private Optimization

arXiv.org Artificial Intelligence

This paper studies how to apply differential privacy to constrained optimization problems whose inputs are sensitive. This task raises significant challenges since random perturbations of the input data often render the constrained optimization problem infeasible or significantly change the nature of its optimal solutions. To address this difficulty, this paper proposes a bilevel optimization model that can be used as a post-processing step: it optimally redistributes the noise introduced by a differentially private mechanism while restoring feasibility and near-optimality. The paper shows that, under a natural assumption, this bilevel model can be solved efficiently for real-life large-scale nonlinear nonconvex optimization problems with sensitive customer data. The experimental results demonstrate the accuracy of the privacy-preserving mechanism and showcase significant benefits compared to standard approaches.
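To make the "perturb, then post-process" idea concrete, here is a minimal sketch. It is not the paper's bilevel model: it assumes a single, publicly known equality constraint (the values must sum to a given total) and repairs the noisy data with a plain L2 projection. The function names, the example loads, and the sensitivity and epsilon values are all hypothetical.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize(values, sensitivity, epsilon, rng):
    # Laplace mechanism: noise scale is sensitivity / epsilon.
    scale = sensitivity / epsilon
    return [v + laplace_noise(scale, rng) for v in values]


def project_to_total(values, total):
    # L2 projection onto {x : sum(x) == total}: shift every entry equally.
    # Assumption: `total` is public, so this step is pure post-processing.
    shift = (total - sum(values)) / len(values)
    return [v + shift for v in values]


rng = random.Random(0)
loads = [10.0, 20.0, 30.0]          # hypothetical sensitive inputs
noisy = privatize(loads, sensitivity=1.0, epsilon=1.0, rng=rng)
repaired = project_to_total(noisy, total=60.0)
```

The noisy values generally violate the sum constraint, while the repaired values satisfy it again. A richer constraint set would need an actual optimization solver in place of the closed-form shift.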


Naive Exploration is Optimal for Online LQR

arXiv.org Machine Learning

We consider the problem of online adaptive control of the linear quadratic regulator, where the true system parameters are unknown. We prove new upper and lower bounds demonstrating that the optimal regret scales as $\widetilde{\Theta}({\sqrt{d_{\mathbf{u}}^2 d_{\mathbf{x}} T}})$, where $T$ is the number of time steps, $d_{\mathbf{u}}$ is the dimension of the input space, and $d_{\mathbf{x}}$ is the dimension of the system state. Notably, our lower bounds rule out the possibility of a $\mathrm{poly}(\log{}T)$-regret algorithm, which has been conjectured due to the apparent strong convexity of the problem. Our upper bounds are attained by a simple variant of \emph{certainty equivalence control}, where the learner selects control inputs according to the optimal controller for their estimate of the system while injecting exploratory random noise. While this approach was shown to achieve $\sqrt{T}$-regret by Mania et al. 2019, we show that if the learner continually refines their estimates of the system matrices, the method attains optimal dimension dependence as well. Central to our upper and lower bounds is a new approach for controlling perturbations of Riccati equations, which we call the \emph{self-bounding ODE method}. The approach enables regret upper bounds which hold for \emph{any stabilizable instance}, require no foreknowledge of the system except for a single stabilizing controller, and scale with natural control-theoretic quantities.
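The certainty-equivalence loop described in the abstract can be illustrated on a scalar system. The sketch below is a toy under stated assumptions, not the paper's algorithm or analysis: the system $x_{t+1} = a x_t + b u_t + w_t$ is one-dimensional, the true parameters, noise levels, warm-up length, and the $t^{-1/4}$ exploration schedule are all hypothetical choices, and the initial gain of zero is stabilizing only because $|a| < 1$ here.

```python
import random


def riccati_gain(a, b, q=1.0, r=1.0, iters=500):
    """Iterate the scalar discrete-time Riccati recursion; return the LQR gain."""
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return a * b * p / (r + b * b * p)


def run_certainty_equivalence(a_true=0.9, b_true=0.5, steps=2000,
                              warmup=50, seed=0):
    rng = random.Random(seed)
    # Running sums for the least-squares normal equations.
    sxx = sxu = suu = sxy = suy = 0.0
    x, gain = 0.0, 0.0          # gain 0 is stabilizing since |a_true| < 1
    a_hat = b_hat = 0.0
    for t in range(1, steps + 1):
        # Control = estimated-optimal feedback + exploratory noise.
        explore = rng.gauss(0.0, t ** -0.25)
        u = -gain * x + explore
        x_next = a_true * x + b_true * u + rng.gauss(0.0, 0.1)
        sxx += x * x
        sxu += x * u
        suu += u * u
        sxy += x * x_next
        suy += u * x_next
        x = x_next
        if t >= warmup:
            # Refit (a, b) by least squares, then act as if the fit were exact.
            det = sxx * suu - sxu * sxu
            a_hat = (sxy * suu - sxu * suy) / det
            b_hat = (sxx * suy - sxu * sxy) / det
            gain = riccati_gain(a_hat, b_hat)
    return a_hat, b_hat, gain


a_hat, b_hat, gain = run_certainty_equivalence()
```

The estimates are refined at every step after the warm-up, which mirrors the "continually refines their estimates" phrasing; a multi-dimensional version would replace the scalar formulas with a matrix least-squares fit and a matrix Riccati solver.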


Comprehensive Analysis of Time Series Forecasting Using Neural Networks

arXiv.org Machine Learning

Time series forecasting has gained considerable attention recently because many real-world phenomena can be modeled as time series. The massive volume of data and recent advances in computer processing power enable researchers to develop more sophisticated machine learning algorithms, such as neural networks, to forecast time series data. In this paper, we propose various neural network architectures to forecast time series data using dynamic measurements; moreover, we introduce various architectures that combine static and dynamic measurements for forecasting. We also investigate the effect of techniques such as anomaly detection and clustering on forecasting accuracy. Our results indicate that clustering can reduce the overall prediction time as well as improve the forecasting performance of the neural network. Furthermore, we show that feature-based clustering can outperform distance-based clustering in terms of speed and efficiency. Finally, our results indicate that adding more predictors to forecast the target variable will not necessarily improve the forecasting accuracy.


Space Walk Underway For Final Fix Of International Space Station Device

NPR Technology

In this image taken from NASA video, astronaut Christina Koch, left, moves away as Jessica Meir, right, exits a hatch as they prepare to install batteries for the International Space Station's solar power grid during a space walk, Monday, Jan. 20. Two astronauts aboard the International Space Station began their fourth and final space walk early Saturday to finish a series of repairs aimed at extending the functioning of a cosmic ray detector attached to the spacecraft. The planned six-and-a-half-hour foray outside the space capsule began shortly after 7:00 a.m. ET and was being shown in a live video feed from NASA.


FogHorn Augments Edge Computing With Machine Learning To Bring Intelligence To Industrial IoT

#artificialintelligence

FogHorn, a Silicon Valley-based startup, is one of the early movers in the IIoT and edge computing market. The company has raised a total of $47.5M in funding over four rounds. The latest funding came from a Series B round in October 2017 by Intel Capital and Saudi Aramco Energy Ventures. Founded in 2014, FogHorn has been squarely focused on edge analytics and edge intelligence. According to the company, its solution enables high-performance edge processing, optimized analytics, and heterogeneous applications to be hosted as close as possible to the control systems and physical sensor infrastructure that pervade the industrial world.


A Lagrangian Dual Framework for Deep Neural Networks with Constraints

arXiv.org Machine Learning

A variety of computationally challenging constrained optimization problems in several engineering disciplines are solved repeatedly under different scenarios. In many cases, they would benefit from fast and accurate approximations, either to support real-time operations or large-scale simulation studies. This paper aims at exploring how to leverage the substantial data being accumulated by repeatedly solving instances of these applications over time. It introduces a deep learning model that exploits Lagrangian duality to encourage the satisfaction of hard constraints. The proposed method is evaluated on a collection of realistic energy networks, by enforcing non-discriminatory decisions on a variety of datasets, and on a transprecision computing application. The results illustrate the effectiveness of the proposed method that dramatically decreases constraint violations by the predictors and, in some applications, increases the prediction accuracy.
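The role of Lagrangian duality can be seen in a deliberately tiny example. This is a sketch under assumptions, not the paper's model: instead of a neural network it trains a single scalar parameter `w`, the objective $(w-3)^2$ and the constraint $w \le 1$ are hypothetical, and the step sizes are arbitrary. The pattern is the part that carries over: a primal gradient step on the Lagrangian, followed by a dual ascent step that raises the multiplier while the constraint is violated.

```python
def lagrangian_dual_training(steps=5000, primal_lr=0.05, dual_lr=0.05):
    w = 0.0      # primal variable (stands in for model parameters)
    lam = 0.0    # Lagrange multiplier for the constraint w - 1 <= 0
    for _ in range(steps):
        violation = w - 1.0
        # Gradient of the Lagrangian (w - 3)^2 + lam * (w - 1) w.r.t. w.
        grad_w = 2.0 * (w - 3.0) + lam
        w -= primal_lr * grad_w
        # Dual ascent: grow lam when violated, shrink it otherwise,
        # and keep it non-negative.
        lam = max(0.0, lam + dual_lr * violation)
    return w, lam


w, lam = lagrangian_dual_training()
```

Without the multiplier the minimizer would be `w = 3`; with it, the iterate settles at the constrained optimum `w = 1` with multiplier `lam = 4`, which is the value at which the objective gradient and the constraint gradient balance.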


The role of surrogate models in the development of digital twins of dynamic systems

arXiv.org Machine Learning

Digital twin technology has significant promise, relevance and potential for widespread applicability in various industrial sectors such as aerospace, infrastructure and automotive. However, the adoption of this technology has been slow due to the lack of clarity for specific applications. A discrete damped dynamic system is used in this paper to explore the concept of a digital twin. As digital twins are also expected to exploit data and computational methods, there is a compelling case for the use of surrogate models in this context. Motivated by this synergy, we have explored the possibility of using surrogate models within digital twin technology. In particular, the use of a Gaussian process (GP) emulator within digital twin technology is explored. GP has the inherent capability of addressing noise and sparse data and hence makes a compelling case for use within the digital twin framework. Cases involving stiffness variation and mass variation are considered, individually and jointly, along with different levels of noise and sparsity in data. Our numerical simulation results clearly demonstrate that surrogate models such as GP emulators have the potential to be an effective tool for the development of digital twins. Aspects related to data quality and sampling rate are analysed. Key concepts introduced in this paper are summarised and ideas for urgent future research needs are proposed.


Learning Non-Markovian Reward Models in MDPs

arXiv.org Artificial Intelligence

There are situations in which an agent should receive rewards only after having accomplished a series of previous tasks. In other words, the reward that the agent receives is non-Markovian. One natural and quite general way to represent history-dependent rewards is via a Mealy machine: a finite state automaton that produces output sequences (rewards in our case) from input sequences (state/action observations in our case). In our formal setting, we consider a Markov decision process (MDP) that models the dynamics of the environment in which the agent evolves and a Mealy machine synchronised with this MDP to formalise the non-Markovian reward function. While the MDP is known to the agent, the reward function is unknown to the agent and must be learnt. Learning non-Markovian reward functions is a challenge. Our approach to overcoming this challenging problem is a careful combination of Angluin's L* active learning algorithm to learn finite automata, testing techniques for establishing conformance of finite model hypotheses, and optimisation techniques for computing optimal strategies in Markovian (immediate) reward MDPs. We also show how our framework can be combined with classical heuristics such as Monte Carlo Tree Search. We illustrate our algorithms and a preliminary implementation on two typical examples for AI.
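A Mealy machine used as a reward model can be sketched in a few lines. The example below only shows the representation, not the learning procedure (L*, conformance testing) from the abstract. The class name, the "key then door" task, and the convention that unlisted inputs leave the machine state unchanged with zero reward are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class MealyRewardMachine:
    """Maps (machine state, observation) to (next machine state, reward)."""
    initial: str
    transitions: dict

    def run(self, observations):
        state = self.initial
        rewards = []
        for obs in observations:
            # Assumption: unlisted pairs keep the state and yield reward 0.
            state, reward = self.transitions.get((state, obs), (state, 0.0))
            rewards.append(reward)
        return rewards


# Hypothetical task: reaching the door pays off only after picking up the key.
machine = MealyRewardMachine(
    initial="no_key",
    transitions={
        ("no_key", "key"): ("has_key", 0.0),
        ("has_key", "door"): ("no_key", 1.0),
    },
)

rewards_with_key = machine.run(["door", "key", "empty", "door"])
rewards_without_key = machine.run(["door", "empty", "door"])
```

The same observation, `"door"`, earns a reward in one trace and nothing in the other, which is exactly what a reward function of the current MDP state alone cannot express.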