Energy
Reinforcement learning with distance-based incentive/penalty (DIP) updates for highly constrained industrial control systems
Park, Hyungjun, Min, Daiki, Ryu, Jong-hyun, Choi, Dong Gu
Typical reinforcement learning (RL) methods show limited applicability to real-world industrial control problems because industrial systems involve various constraints and simultaneously require continuous and discrete control. To overcome these challenges, we devise a novel RL algorithm that enables an agent to handle a highly constrained action space. This algorithm has two main features. First, we devise two distance-based Q-value update schemes, incentive update and penalty update, in a distance-based incentive/penalty update technique to enable the agent to decide discrete and continuous actions in the feasible region and to update the value of these types of actions. Second, we propose a method for defining the penalty cost as a shadow price-weighted penalty. Compared with previous methods, this approach offers two advantages in efficiently inducing the agent not to select an infeasible action. We apply our algorithm to an industrial control problem, microgrid system operation, and the experimental results demonstrate its superiority.
Continuous Ant-Based Neural Topology Search
ElSaid, AbdElRahman, Karns, Joshua, Lyu, Zimeng, Ororbia, Alexander, Desell, Travis
Manually optimizing artificial neural network (ANN) structures has been an obstacle to the advancement of machine learning as it is significantly time-consuming and requires a considerable level of domain expertise [1]. The structure of an ANN is typically chosen based on its reputation in the existing literature or on knowledge shared across the machine learning community; however, changing even a few problem-specific meta-parameters can lead to poor generalization upon committing to a specific topology [2, 3]. To address these challenges, a number of neural architecture search (NAS) [1, 4-8] and neuroevolution (NE) [9, 10] algorithms have been developed to automate the process of ANN design. More recently, nature-inspired neural architecture search (NINAS) algorithms have shown increasing promise, including the Artificial Bee Colony (ABC) optimization procedure [11], the Bat algorithm [12], the Firefly algorithm [13], and the Cuckoo Search algorithm [14]. Among the more recent successfully applied NINAS strategies are those based on ant colony optimization (ACO) [15], which have proven to be particularly powerful when automating the design of recurrent neural networks (RNNs). Originally, ACO for NAS was limited to small structures based on Jordan and Elman RNNs [16] or was used as a process for reducing the number of network inputs [17]. Later work proposed generalizations of ACO for optimizing the synaptic connections of RNN memory cell structures [18] and even entire RNN architectures in an algorithmic framework called Ant-based Neural Topology Search (ANTS) [19]. In the ANTS process, ants traverse a single massively-connected "superstructure", which contains all of the possible ways that the nodes of an RNN may connect with each other, both in terms of structure (i.e., all possible feed forward connections), and in time (i.e., all possible recurrent synapses that span many different time delays), searching for optimal RNN sub-networks.
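The idea of ants walking a densely connected superstructure can be illustrated with a generic ant colony optimization step. The sketch below is not the ANTS implementation: the graph layout, the function names `sample_path` and `deposit`, and the evaporation rate are all assumptions made for illustration, and it covers only feed-forward edges in a small acyclic graph where every node can reach the goal.

```python
import random


def sample_path(pheromone, start, goal, rng):
    """Walk from start to goal, picking each next node with
    probability proportional to the pheromone on the outgoing edge.

    Assumes an acyclic graph in which every node can reach the goal.
    """
    path = [start]
    node = start
    while node != goal:
        successors = list(pheromone[node].keys())
        weights = [pheromone[node][s] for s in successors]
        node = rng.choices(successors, weights=weights, k=1)[0]
        path.append(node)
    return path


def deposit(pheromone, path, amount, evaporation=0.1):
    """Evaporate pheromone on every edge, then reinforce the edges on path."""
    for src in pheromone:
        for dst in pheromone[src]:
            pheromone[src][dst] *= 1.0 - evaporation
    for src, dst in zip(path, path[1:]):
        pheromone[src][dst] += amount


# Hypothetical toy superstructure: input -> (h1, h2) -> output, with skips.
pheromone = {
    "in": {"h1": 1.0, "h2": 1.0, "out": 1.0},
    "h1": {"h2": 1.0, "out": 1.0},
    "h2": {"out": 1.0},
}
rng = random.Random(0)
path = sample_path(pheromone, "in", "out", rng)
deposit(pheromone, path, amount=0.5)
```

In a real search, the sampled path would define a candidate sub-network, and the deposited amount would depend on how well that network performs after training.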
Researchers develop machine-learning optimizer to slash product design costs
Computer simulations are a critical part of the product design optimization process, allowing engineers to test various configurations and select the best design among the many different alternatives. But even at a facility like the U.S. Department of Energy's (DOE) Argonne National Laboratory, with its state-of-the-art resources, simulations can be very expensive and take a long time to run. With the goal of accelerating this design process, a research team in Argonne's Energy Systems (ES) division, comprised of postdoctoral appointee Opeoluwa Owoyele and research scientist Pinaki Pal, recently developed a new design optimization tool called ActivO. The new tool can drastically reduce the time needed to find the best design. It employs a novel machine learning technique that helps users focus on how to most efficiently target computational resources.
Climate change: Some areas of the Amazon could actually BENEFIT from warmer temperatures
Warmer temperatures may benefit parts of the Amazon rainforest, suggesting that the tropical ecosystem may be more resistant to climate change than once thought. It had previously been thought that water stress brought on by global warming and the drying out of the soil and air would broadly harm the plants of the Amazon. This would lead to reduced photosynthesis -- the chemical process by which plants make food and absorb carbon dioxide -- and help accelerate climate change. However, US researchers found that wetter areas of the world's largest rainforest actually grow leaves more efficient at photosynthesis when exposed to dry air. The team warned that there is a limit to this, however, and that excessively warm temperatures would still cause damage to even these resilient parts of the forest.
A time of resiliency, change and innovation: How cloud-focused business strategies are driving transformation across industries - The Official Microsoft Blog
To help its service technicians more efficiently repair and maintain its models, Mercedes-Benz USA is outfitting all of its authorized American dealerships with HoloLens 2 headsets. The devices are equipped with Microsoft Dynamics 365 Remote Assist, a mixed reality app that lets users collaborate during hands-free video calls from their own computers. Organizations have long known the importance of business resiliency, but becoming resilient requires time and preparation, and the pandemic has forced many organizations to evolve at a pace few could have imagined. To recover and thrive within this new context presents new challenges. That is why we are partnering with customers to support faster adoption of digital capabilities.
Flame on! How AI may tame a complex materials technique and transform manufacturing
Creating nanomaterials with flame spray pyrolysis is complex, but scientists at Argonne have discovered how applying artificial intelligence can lead to an easier process and better performance. During a tour of the Manufacturing and Engineering Research Facility at the U.S. Department of Energy's Argonne National Laboratory, Marius Stan, the Intelligent Materials Design lead in Argonne's Applied Materials Division (AMD), encountered a new experimental setup. As he watched the machine in the experiment, which relies on flame to produce nanomaterials, he had a thought: Could artificial intelligence be used to optimize this complex process? When asked to explain the process, Stan put it simply: "It's where scientists put chemicals in a flame and wait for a miracle--for particles to appear at the end of the process, particles that have important properties for a variety of applications." Flame spray pyrolysis is a technology that enables the manufacturing of nanomaterials in high volumes, which in turn is critical to producing a wide range of industrial materials, like chemical catalysts, battery electrolytes/cathodes and pigments.
Artificial Intelligence: A Workmate for the Human Resource Department
We are accelerating fast into an Artificial Intelligence (AI) driven digital era. Not a moment goes by when digital is not part of our daily lives. And that's not just about smart devices at home or collaborating on MS Teams or Zoom meetings but extends to the cars we drive, the payments we make and the shopping we do. While so much of our lives is surrounded and enhanced by digital experiences, when it comes to the most crucial resource that helps companies achieve goals and scale to new heights, that is, human resources, AI is a tiny component. It will be a pity if we can't extend and use the very tools that make our lives so much better when it comes to talent or human resources management.
Seismic Facies Analysis: A Deep Domain Adaptation Approach
Nasim, M Quamer, Maiti, Tannistha, Shrivastava, Ayush, Singh, Tarry, Mei, Jie
Deep neural networks (DNNs) can learn accurately from large quantities of labeled input data, but DNNs sometimes fail to generalize to test data sampled from different input distributions. Unsupervised Deep Domain Adaptation (DDA) proves useful when no input labels are available, and distribution shifts are observed in the target domain (TD). Experiments are performed on seismic images of the F3 block 3D dataset from offshore Netherlands (source domain; SD) and Penobscot 3D survey data from Canada (target domain; TD). Three geological classes from SD and TD that have similar reflection patterns are considered. In the present study, an improved deep neural network architecture named EarthAdaptNet (EAN) is proposed to semantically segment the seismic images. We specifically use a transposed residual unit to replace the traditional dilated convolution in the decoder block. The EAN achieved a pixel-level accuracy >84% and an accuracy of ~70% for the minority classes, showing improved performance compared to existing architectures. In addition, we introduced the CORAL (Correlation Alignment) method to the EAN to create an unsupervised deep domain adaptation network (EAN-DDA) for the classification of seismic reflections from F3 and Penobscot. Maximum class accuracy achieved was ~99% for class 2 of Penobscot with >50% overall accuracy. Taken together, EAN-DDA has the potential to classify target domain seismic facies classes with high accuracy.
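For readers unfamiliar with CORAL, the core idea is to penalize the difference between the feature covariances of the source and target domains. The following is a minimal NumPy sketch of the commonly used Deep CORAL loss form, the squared Frobenius norm of the covariance difference scaled by 1/(4d^2); it is not the EAN-DDA code, and the function name `coral_loss` and the toy feature matrices are assumptions for illustration.

```python
import numpy as np


def coral_loss(source, target):
    """CORAL loss between two feature matrices of shape (n_samples, d).

    Computes ||C_s - C_t||_F^2 / (4 d^2), where C_s and C_t are the
    sample covariance matrices of the source and target features.
    """
    d = source.shape[1]
    cov_s = np.cov(source, rowvar=False)  # (d, d) source covariance
    cov_t = np.cov(target, rowvar=False)  # (d, d) target covariance
    return float(np.sum((cov_s - cov_t) ** 2) / (4.0 * d * d))


# Toy features: the two domains vary along different axes.
source_feats = np.array([[0.0, 0.0], [2.0, 0.0]])
target_feats = np.array([[0.0, 0.0], [0.0, 2.0]])
loss = coral_loss(source_feats, target_feats)
```

In a domain adaptation network, a term like this would typically be added to the supervised loss on the labeled source data, so that the learned features of both domains share second-order statistics.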
Double Meta-Learning for Data Efficient Policy Optimization in Non-Stationary Environments
Aghapour, Elahe, Ayanian, Nora
We are interested in learning models of non-stationary environments, which can be framed as a multi-task learning problem. Model-free reinforcement learning algorithms can achieve good asymptotic performance in multi-task learning at a cost of extensive sampling, due to their approach, which requires learning from scratch. While model-based approaches are among the most data-efficient learning algorithms, they still struggle with complex tasks and model uncertainties. Meta-reinforcement learning addresses the efficiency and generalization challenges of multi-task learning by quickly leveraging the meta-prior policy for a new task. In this paper, we propose a meta-reinforcement learning approach to learn the dynamic model of a non-stationary environment to be used for meta-policy optimization later. Due to the sample efficiency of model-based learning methods, we are able to simultaneously train both the meta-model of the non-stationary environment and the meta-policy until dynamic model convergence. Then, the meta-learned dynamic model of the environment will generate simulated data for meta-policy optimization. Our experiment demonstrates that our proposed method can meta-learn the policy in a non-stationary environment with the data efficiency of model-based learning approaches while achieving the high asymptotic performance of model-free meta-reinforcement learning.
Online Learning Based Risk-Averse Stochastic MPC of Constrained Linear Uncertain Systems
This paper investigates the problem of designing data-driven stochastic Model Predictive Control (MPC) for linear time-invariant systems under additive stochastic disturbance, whose probability distribution is unknown but can be partially inferred from data. We propose a novel online learning based risk-averse stochastic MPC framework in which Conditional Value-at-Risk (CVaR) constraints on system states are required to hold for a family of distributions called an ambiguity set. The ambiguity set is constructed from disturbance data by leveraging a Dirichlet process mixture model that is self-adaptive to the underlying data structure and complexity. Specifically, the structural property of multimodality is exploited, so that the first- and second-order moment information of each mixture component is incorporated into the ambiguity set. A novel constraint tightening strategy is then developed based on an equivalent reformulation of distributionally robust CVaR constraints over the proposed ambiguity set. As more data are gathered during the runtime of the controller, the ambiguity set is updated online using real-time disturbance data, which enables the risk-averse stochastic MPC to cope with time-varying disturbance distributions. The online variational inference algorithm employed does not require all collected data be learned from scratch, and therefore the proposed MPC is endowed with the guaranteed computational complexity of online learning. The guarantees on recursive feasibility and closed-loop stability of the proposed MPC are established via a safe update scheme. Numerical examples are used to illustrate the effectiveness and advantages of the proposed MPC.
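As background on the risk measure used here, CVaR at level alpha is the expected loss in the worst (1 - alpha) tail of the distribution. The sketch below estimates it from samples under a single empirical distribution; it does not implement the paper's distributionally robust formulation over an ambiguity set, and the function name `empirical_cvar` and the sample data are assumptions for illustration.

```python
import numpy as np


def empirical_cvar(losses, alpha):
    """Estimate (VaR, CVaR) at level alpha from loss samples.

    VaR is taken as the alpha-quantile of the samples, and CVaR as the
    mean of all samples at or above that quantile (a simple tail-mean
    estimator).
    """
    losses = np.asarray(losses, dtype=float)
    var = np.quantile(losses, alpha)   # value-at-risk threshold
    tail = losses[losses >= var]       # worst-case tail samples
    return float(var), float(tail.mean())


# Hypothetical constraint-violation samples collected from disturbance data.
samples = np.arange(1, 11)
var_80, cvar_80 = empirical_cvar(samples, alpha=0.8)
```

A CVaR constraint of the form CVaR <= 0 is stricter than the corresponding chance constraint because it accounts for how large the violations are in the tail, not only how often they occur.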