Energy
Accelerating Reinforcement Learning with a Directional-Gaussian-Smoothing Evolution Strategy
Zhang, Jiaxing, Tran, Hoang, Zhang, Guannan
Evolution strategy (ES) has shown great promise in many challenging reinforcement learning (RL) tasks, rivaling other state-of-the-art deep RL methods. Yet two limitations in current ES practice may hinder further gains. First, most current methods rely on Monte Carlo type gradient estimators to suggest the search direction, where the policy parameter is, in general, randomly sampled. Due to the low accuracy of such estimators, RL training may suffer from slow convergence and require more iterations to reach an optimal solution. Second, the landscape of the reward function can be deceptive and contain many local maxima, causing ES algorithms to converge prematurely and fail to explore other parts of the parameter space with potentially greater rewards. In this work, we employ a Directional Gaussian Smoothing Evolutionary Strategy (DGS-ES) to accelerate RL training, which is well suited to address these two challenges with its ability to i) provide gradient estimates with high accuracy, and ii) find a nonlocal search direction that emphasizes large-scale variation of the reward function and disregards local fluctuation. Through several benchmark RL tasks, we show that DGS-ES is highly scalable, achieves superior wall-clock time, and attains reward scores competitive with other popular policy gradient and ES approaches.
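As a rough illustration of the "directional Gaussian smoothing" idea named in the abstract, the sketch below estimates a gradient by smoothing the reward along each direction of an orthonormal basis and evaluating the resulting one-dimensional integral with Gauss-Hermite quadrature instead of random sampling. The function name `dgs_gradient`, the parameters `sigma` and `num_nodes`, and the quadratic test reward are assumptions made for this example; the abstract does not give these details, so this should not be read as the authors' implementation.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss


def dgs_gradient(f, x, sigma=0.5, num_nodes=5, basis=None):
    """Estimate a smoothed gradient of f at x, one direction at a time.

    For each unit direction xi, the derivative of the 1-D Gaussian-smoothed
    function is (1/sigma) * E_{v~N(0,1)}[v * f(x + sigma*v*xi)], which is
    approximated here with Gauss-Hermite quadrature (hypothetical sketch).
    """
    d = x.size
    if basis is None:
        basis = np.eye(d)  # assumed default: the coordinate directions
    nodes, weights = hermgauss(num_nodes)  # rule for the weight exp(-t^2)
    grad = np.zeros(d)
    for xi in basis:
        # Change of variables v = sqrt(2) * t maps N(0,1) onto exp(-t^2).
        vals = np.array([f(x + np.sqrt(2.0) * sigma * t * xi) for t in nodes])
        deriv = np.sum(weights * np.sqrt(2.0) * nodes * vals) / (np.sqrt(np.pi) * sigma)
        grad += deriv * xi
    return grad


def reward(theta):
    """Toy concave reward, used only to check the estimator."""
    return -np.sum(theta ** 2)


x0 = np.array([1.0, -2.0, 0.5])
g = dgs_gradient(reward, x0)
```

For this quadratic reward the quadrature is exact, so `g` matches the true gradient `-2 * x0`. A larger `sigma` averages over a wider neighborhood, which is the sense in which the search direction is nonlocal.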
Devices found in Houthi missiles and Yemen drones link Iran to attacks
DUBAI, UNITED ARAB EMIRATES - A small instrument found inside the drones that targeted the heart of Saudi Arabia's oil industry, and inside those in the arsenal of Yemen's Houthi rebels, matches components recovered in downed Iranian drones in Afghanistan and Iraq, two reports say. These gyroscopes have only been found inside drones manufactured by Iran, Conflict Armament Research said in a report released on Wednesday. That follows a recently released report from the United Nations saying its experts saw a similar gyroscope from an Iranian drone obtained by the U.S. military in Afghanistan, as well as in weapons shipments seized in the Arabian Sea bound for Yemen. The discovery further ties Iran to an attack that briefly halved Saudi Arabia's oil output and saw energy prices spike by a level unseen since the 1991 Gulf War. It also ties Iran to the arming of the rebel Houthis in Yemen's long civil war.
Beyond Forecasting: Artificial Intelligence Is a Powerful Decarbonization Tool
For the first time, artificial-intelligence experts have created a place to collaborate, Climate Change AI. Some are mining the massive data of remotely-sensed emissions streams. Some are accelerating materials discovery for solar fuels by combining machine learning and physics to figure out a proposed material's crystal structure. Some are using system optimization to consolidate freight and route it more efficiently. Others are deploying agricultural robots armed with spectral cameras in hope of reducing fertilizer use in farming.
Multi-Agent Meta-Reinforcement Learning for Self-Powered and Sustainable Edge Computing Systems
Munir, Md. Shirajum, Tran, Nguyen H., Saad, Walid, Hong, Choong Seon
The stringent requirements of mobile edge computing (MEC) applications and functions call for high-capacity, densely deployed MEC hosts in upcoming wireless networks. However, operating such high-capacity MEC hosts can significantly increase energy consumption. Thus, a base station (BS) unit can act as a self-powered BS. In this paper, an effective energy dispatch mechanism for self-powered wireless networks with edge computing capabilities is studied. First, a two-stage linear stochastic programming problem is formulated with the goal of minimizing the total energy consumption cost of the system while fulfilling the energy demand. Second, a semi-distributed data-driven solution is proposed by developing a novel multi-agent meta-reinforcement learning (MAMRL) framework to solve the formulated problem. In particular, each BS plays the role of a local agent that explores a Markovian behavior for both energy consumption and generation while transferring time-varying features to a meta-agent. In turn, the meta-agent optimizes (i.e., exploits) the energy dispatch decision by accepting only the observations from each local agent with its own state information. Meanwhile, each BS agent estimates its own energy dispatch policy by applying the learned parameters from the meta-agent. Finally, the proposed MAMRL framework is benchmarked by analyzing deterministic, asymmetric, and stochastic environments in terms of non-renewable energy usage, energy cost, and accuracy. Experimental results show that the proposed MAMRL model can reduce non-renewable energy usage by up to 11% and energy cost by 22.4% (with 95.8% prediction accuracy), compared to other baseline methods.
Federated Learning in the Sky: Joint Power Allocation and Scheduling with UAV Swarms
Zeng, Tengchan, Semiari, Omid, Mozaffari, Mohammad, Chen, Mingzhe, Saad, Walid, Bennis, Mehdi
Unmanned aerial vehicle (UAV) swarms must exploit machine learning (ML) in order to execute various tasks ranging from coordinated trajectory planning to cooperative target recognition. However, due to the lack of continuous connections between the UAV swarm and ground base stations (BSs), using centralized ML will be challenging, particularly when dealing with a large volume of data. In this paper, a novel framework is proposed to implement distributed federated learning (FL) algorithms within a UAV swarm that consists of a leading UAV and several following UAVs. Each following UAV trains a local FL model based on its collected data and then sends this trained local model to the leading UAV, which aggregates the received models, generates a global FL model, and transmits it to the followers over the intra-swarm network. To identify how wireless factors, like fading, transmission delay, and UAV antenna angle deviations resulting from wind and mechanical vibrations, impact the performance of FL, a rigorous convergence analysis for FL is performed. Then, a joint power allocation and scheduling design is proposed to optimize the convergence rate of FL while taking into account the energy consumption during convergence and the delay requirement imposed by the swarm's control system. Simulation results validate the effectiveness of the FL convergence analysis and show that the joint design strategy can reduce the number of communication rounds needed for convergence by as much as 35% compared with the baseline design.
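The leader/follower loop described in the abstract can be sketched as follows. This is a minimal illustration that assumes sample-count-weighted averaging (the common FedAvg rule) for the leader's aggregation step and a linear least-squares model for each follower; the abstract specifies neither. Wireless effects such as fading, delay, and antenna deviation are ignored entirely, and all function names are hypothetical.

```python
import numpy as np


def local_update(w, X, y, lr=0.1, steps=5):
    """Follower step: a few gradient-descent steps on local mean squared error."""
    w = w.copy()
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w


def aggregate(local_models, sample_counts):
    """Leader step: average local models, weighted by local sample counts."""
    weights = np.asarray(sample_counts, dtype=float)
    weights = weights / weights.sum()
    return np.tensordot(weights, np.stack(local_models), axes=1)


def run_rounds(followers, dim, rounds=100):
    """Alternate follower training and leader aggregation for a fixed number of rounds."""
    w_global = np.zeros(dim)
    for _ in range(rounds):
        local_models = [local_update(w_global, X, y) for X, y in followers]
        w_global = aggregate(local_models, [len(y) for _, y in followers])
    return w_global


# Synthetic, noiseless data: every follower observes the same linear model.
rng = np.random.default_rng(0)
w_true = np.array([1.5, -0.5])
followers = []
for n in (40, 60, 80):
    X = rng.normal(size=(n, 2))
    followers.append((X, X @ w_true))

w_learned = run_rounds(followers, dim=2)
```

Only model parameters move between followers and leader here; the raw data `X, y` never leaves a follower, which is the property that makes the scheme attractive when links to ground stations are intermittent.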
Trust Your Model: Iterative Label Improvement and Robust Training by Confidence Based Filtering and Dataset Partitioning
Haase-Schütz, Christian, Stal, Rainer, Hertlein, Heinz, Sick, Bernhard
State-of-the-art, high-capacity deep neural networks not only require large amounts of labelled training data, they are also highly susceptible to label errors in this data, typically resulting in large efforts and costs and therefore limiting the applicability of deep learning. To alleviate this issue, we propose a novel meta training and labelling scheme that is able to use inexpensive unlabelled data by taking advantage of the generalization power of deep neural networks. We show experimentally that by solely relying on one network architecture and our proposed scheme of iterative training and prediction steps, both label quality and resulting model accuracy can be improved significantly. Our method achieves state-of-the-art results, while being architecture agnostic and therefore broadly applicable. Compared to other methods dealing with erroneous labels, our approach neither requires another network to be trained, nor necessarily needs an additional, highly accurate reference label set. Instead of removing samples from a labelled set, our technique uses additional sensor data without the need for manual labelling.
French energy giants EDF and Total bet on artificial intelligence
Electrical systems company Thales, state-owned electric utility EDF and oil major Total were among the eight French signatories of a manifesto for an artificial intelligence (AI) industry launched on July 3 at the Ministry of the Economy and Finance. The manifesto was intended to promote research and development resources to make AI a source of growth and jobs across industrial sectors within an ethical framework. With that commitment in mind, the fossil fuel companies and Thales have announced the opening of an AI industrial research laboratory. The work to be carried out at the EDF Lab Paris-Saclay research and training center will focus on "AI technologies adapted to the needs of critical industrial systems", namely, vulnerable systems where malfunctions could have serious consequences. Among such systems, EDF cited aeronautical applications and energy production facilities.
Constrained Multiagent Rollout and Multidimensional Assignment with the Auction Algorithm
We consider an extension of the rollout algorithm that applies to constrained deterministic dynamic programming, including challenging combinatorial optimization problems. The algorithm relies on a suboptimal policy, called the base heuristic. Under suitable assumptions, we show that if the base heuristic produces a feasible solution, the rollout algorithm has a cost improvement property: it produces a feasible solution whose cost is no worse than the base heuristic's cost. We then focus on multiagent problems, where the control at each stage consists of multiple components (one per agent), which are coupled through the cost function, the constraints, or both. We show that the cost improvement property is maintained with an alternative implementation that has greatly reduced computational requirements and makes possible the use of rollout in problems with many agents. We demonstrate this alternative algorithm by applying it to layered graph problems that involve both a spatial and a temporal structure. We consider in some detail a prominent example of such problems: multidimensional assignment, where we use the auction algorithm for 2-dimensional assignment as a base heuristic. This auction algorithm is particularly well suited to our context because, through the use of prices, it can advantageously use the solution of an assignment problem as a starting point for solving other related assignment problems, and this can greatly speed up the execution of the rollout algorithm.
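The cost improvement property can be seen in a small unconstrained example. The sketch below applies plain rollout to a toy tour-construction problem, with nearest-neighbor completion as the base heuristic; it is not the constrained or multiagent variant the abstract studies, and the problem, the city coordinates, and all function names are assumptions chosen for illustration. Because nearest neighbor with deterministic tie-breaking always continues a partial tour the same way, the rollout tour can never cost more than the heuristic's tour.

```python
import math


def tour_cost(dist, tour):
    """Total length of a closed tour that returns to its first city."""
    return sum(dist[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))


def base_heuristic(dist, partial):
    """Complete a partial tour by repeatedly visiting the nearest unvisited city."""
    tour = list(partial)
    remaining = set(range(len(dist))) - set(tour)
    while remaining:
        last = tour[-1]
        nxt = min(remaining, key=lambda c: (dist[last][c], c))  # ties: lowest index
        tour.append(nxt)
        remaining.remove(nxt)
    return tour


def rollout(dist, start=0):
    """At each stage, try every next city, complete the tour with the base
    heuristic, and commit to the city whose completed tour is cheapest."""
    n = len(dist)
    tour = [start]
    while len(tour) < n:
        candidates = [c for c in range(n) if c not in tour]
        best = min(
            candidates,
            key=lambda c: (tour_cost(dist, base_heuristic(dist, tour + [c])), c),
        )
        tour.append(best)
    return tour


# Hypothetical city coordinates, used only for this demonstration.
points = [(0, 0), (1, 5), (5, 2), (6, 6), (2, 1), (7, 1), (3, 4)]
dist = [[math.hypot(px - qx, py - qy) for (qx, qy) in points] for (px, py) in points]

base_tour = base_heuristic(dist, [0])
rollout_tour = rollout(dist)
base_cost = tour_cost(dist, base_tour)
rollout_cost = tour_cost(dist, rollout_tour)
```

Each rollout stage runs the heuristic once per candidate city, so the price of the improvement guarantee is roughly a factor of n more heuristic calls per stage. That overhead is what the reduced-computation multiagent implementation in the abstract is aimed at.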
A Wasserstein Minimum Velocity Approach to Learning Unnormalized Models
Wang, Ziyu, Cheng, Shuyu, Li, Yueru, Zhu, Jun, Zhang, Bo
Score matching provides an effective approach to learning flexible unnormalized models, but its scalability is limited by the need to evaluate a second-order derivative. In this paper, we present a scalable approximation to a general family of learning objectives including score matching, by observing a new connection between these objectives and Wasserstein gradient flows. We present applications with promise in learning neural density estimators on manifolds, and training implicit variational and Wasserstein auto-encoders with a manifold-valued prior.
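For context on where the second-order derivative arises, the standard score matching objective (Hyvärinen's formulation) can be written as below. The notation is an assumption for this note, not taken from the abstract: $q_\theta$ is the unnormalized model density, $p_{\mathrm{data}}$ the data distribution, and $x \in \mathbb{R}^d$.

```latex
J(\theta) = \mathbb{E}_{x \sim p_{\mathrm{data}}}\left[
    \operatorname{tr}\!\left(\nabla_x^2 \log q_\theta(x)\right)
    + \tfrac{1}{2}\left\|\nabla_x \log q_\theta(x)\right\|_2^2
\right]
```

The normalizing constant drops out because only gradients of $\log q_\theta$ with respect to $x$ appear. The trace of the Hessian is the costly term: computed naively, it needs on the order of $d$ backward passes per sample.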
ESG investments: Filtering versus machine learning approaches
de Franco, Carmine, Geissler, Christophe, Margot, Vincent, Monnier, Bruno
We designed a machine learning algorithm that identifies patterns between ESG profiles and financial performance for companies in a large investment universe. The algorithm consists of regularly updated sets of rules that map regions of the high-dimensional space of ESG features to excess return predictions. The final aggregated predictions are transformed into scores, which allow us to design simple strategies that screen the investment universe for stocks with positive scores. By linking the ESG features with financial performance in a non-linear way, our machine learning strategy turns out to be an efficient stock-picking tool, which outperforms classic strategies that screen stocks according to their ESG ratings, such as the popular best-in-class approach. Our paper brings new ideas to the growing field of financial literature that investigates the links between ESG behavior and the economy. We show that there is clearly some form of alpha in the ESG profile of a company, but that this alpha can be accessed only with powerful, non-linear techniques such as machine learning.