 Energy


Modern Hopfield Networks for Few- and Zero-Shot Reaction Prediction

arXiv.org Artificial Intelligence

An essential step in the discovery of new drugs and materials is the synthesis of a molecule that exists so far only as an idea to test its biological and physical properties. While computer-aided design of virtual molecules has made large progress, computer-assisted synthesis planning (CASP) to realize physical molecules is still in its infancy and lacks a performance level that would enable large-scale molecule discovery. CASP supports the search for multi-step synthesis routes, which is very challenging due to high branching factors in each synthesis step and the hidden rules that govern the reactions. The central and repeatedly applied step in CASP is reaction prediction, for which machine learning methods yield the best performance. We propose a novel reaction prediction approach that uses a deep learning architecture with modern Hopfield networks (MHNs) that is optimized by contrastive learning. An MHN is an associative memory that can store and retrieve chemical reactions in each layer of a deep learning architecture. We show that our MHN contrastive learning approach enables few- and zero-shot learning for reaction prediction which, in contrast to previous methods, can deal with rare, single, or even no training example(s) for a reaction. On a well-established benchmark, our MHN approach pushes the state-of-the-art performance up by a large margin as it improves the predictive top-100 accuracy from $0.858\pm0.004$ to $0.959\pm0.004$. This advance might pave the way to large-scale molecule discovery.
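The associative-memory retrieval mentioned in this abstract can be illustrated with the standard modern Hopfield update, in which a query is replaced by a softmax-weighted average of the stored patterns. The sketch below is only a minimal illustration of that generic update, not the authors' architecture: the toy patterns, the function names `softmax` and `hopfield_retrieve`, and the inverse temperature `beta` are all assumptions made for this example.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def hopfield_retrieve(stored_patterns, query, beta=1.0):
    """One modern-Hopfield-style update (hypothetical helper).

    Scores each stored pattern by its scaled dot product with the query,
    turns the scores into softmax weights, and returns the weighted
    average of the stored patterns together with the weights.
    """
    scores = [beta * sum(p_i * q_i for p_i, q_i in zip(p, query))
              for p in stored_patterns]
    weights = softmax(scores)
    dim = len(query)
    retrieved = [sum(w * p[d] for w, p in zip(weights, stored_patterns))
                 for d in range(dim)]
    return retrieved, weights

# Toy memory of three one-hot "patterns" (illustrative data only).
patterns = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
query = [0.9, 0.1, 0.0]  # a noisy version of the first pattern
retrieved, weights = hopfield_retrieve(patterns, query, beta=8.0)
print(retrieved, weights)
```

With a large `beta` the weights concentrate on the closest stored pattern, so the noisy query is pulled toward the first pattern; a small `beta` instead blends several patterns.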


Empowering Prosumer Communities in Smart Grid with Wireless Communications and Federated Edge Learning

arXiv.org Artificial Intelligence

The exponential growth of distributed energy resources is enabling the transformation of traditional consumers in the smart grid into prosumers. Such a transition presents a promising opportunity for sustainable energy trading. Yet, the integration of prosumers in the energy market imposes new considerations in designing unified and sustainable frameworks for efficient use of the power and communication infrastructure. Furthermore, several issues need to be tackled to adequately promote the adoption of decentralized renewable-oriented systems, such as communication overhead, data privacy, scalability, and sustainability. In this article, we present the different aspects and challenges to be addressed for building efficient energy trading markets in relation to communication and smart decision-making. Accordingly, we propose a multi-level pro-decision framework for prosumer communities to achieve collective goals. Since the individual decisions of prosumers are mainly driven by individual self-sufficiency goals, the framework prioritizes the individual prosumers' decisions and relies on a 5G wireless network for fast coordination among community members. In fact, each prosumer predicts energy production and consumption to make proactive trading decisions as a response to collective-level requests. Moreover, the collaboration of the community is further extended by including the collaborative training of prediction models using Federated Learning, assisted by edge servers and prosumer home-area equipment. In addition to preserving prosumers' privacy, we show through evaluations that training prediction models using Federated Learning yields high accuracy for different energy resources while reducing the communication overhead.
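The collaborative training described in this abstract can be pictured with federated averaging, a common aggregation rule in Federated Learning. The abstract does not say which aggregation rule the authors use, so the sketch below is an assumption: the function name `federated_average`, the two-parameter "models", and the sample counts are hypothetical and chosen only for illustration.

```python
def federated_average(client_weights, client_sizes):
    """Aggregate client model parameters, weighting by local dataset size.

    client_weights: one flat parameter list per client (equal lengths).
    client_sizes:   number of local training samples per client.
    Only parameters leave each client; raw consumption data stays local.
    """
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [sum(w[d] * n for w, n in zip(client_weights, client_sizes)) / total
            for d in range(dim)]

# Two hypothetical prosumers with locally trained two-parameter models.
client_weights = [[1.0, 2.0], [3.0, 4.0]]
client_sizes = [1, 3]  # the second prosumer has three times as much data
global_weights = federated_average(client_weights, client_sizes)
print(global_weights)  # [2.5, 3.5]
```

In this toy setting the edge server would receive only the two parameter lists and the sample counts, which is where the privacy and communication savings mentioned in the abstract come from.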


Researchers use AI to estimate focal mechanism parameters of earthquake

#artificialintelligence

The research team led by Prof. Zhang Jie from the University of Science and Technology of China (USTC) of the Chinese Academy of Sciences made progress on real-time determination of earthquake focal mechanisms through deep learning. The work was published in Nature Communications. Since there are connections between the characteristics of the rupture surface of the source fault and the seismic waves radiated by the source, it is vital to monitor earthquakes through immediate determination of the source focal mechanism, which is inferred from multiple ground seismic records. However, it is hard to calculate the mechanism from the simple records. The parameters about focal mechanisms are either merely reported or reported after a few minutes or even longer.


Nonlinear Model Based Guidance with Deep Learning Based Target Trajectory Prediction Against Aerial Agile Attack Patterns

arXiv.org Artificial Intelligence

In this work, we propose a novel missile guidance algorithm that combines deep learning based trajectory prediction with nonlinear model predictive control. Although missile guidance and threat interception is a well-studied problem, existing algorithms' performance degrades significantly when the target is pulling high acceleration attack maneuvers while rapidly changing its direction. We argue that since most threats execute similar attack maneuvers, these nonlinear trajectory patterns can be processed with modern machine learning methods to build high accuracy trajectory prediction algorithms. We train a long short-term memory network (LSTM) based on a class of simulated structured agile attack patterns, then combine this predictor with quadratic programming based nonlinear model predictive control (NMPC). Our method, named nonlinear model based predictive control with target acceleration predictions (NMPC-TAP), significantly outperforms compared approaches in terms of miss distance, for the scenarios where the target/threat is executing agile maneuvers.


gradSim: Differentiable simulation for system identification and visuomotor control

arXiv.org Artificial Intelligence

We consider the problem of estimating an object's physical properties such as mass, friction, and elasticity directly from video sequences. Such a system identification problem is fundamentally ill-posed due to the loss of information during image formation. Current solutions require precise 3D labels which are labor-intensive to gather, and infeasible to create for many systems such as deformable solids or cloth. We present gradSim, a framework that overcomes the dependence on 3D supervision by leveraging differentiable multiphysics simulation and differentiable rendering to jointly model the evolution of scene dynamics and image formation. This novel combination enables backpropagation from pixels in a video sequence through to the underlying physical attributes that generated them. Moreover, our unified computation graph -- spanning from the dynamics and through the rendering process -- enables learning in challenging visuomotor control tasks, without relying on state-based (3D) supervision, while obtaining performance competitive to or better than techniques that rely on precise 3D labels.


GEM: Group Enhanced Model for Learning Dynamical Control Systems

arXiv.org Artificial Intelligence

Learning the dynamics of a physical system wherein an autonomous agent operates is an important task. Often these systems present apparent geometric structures. For instance, the trajectories of a robotic manipulator can be broken down into a collection of its translational and rotational motions, fully characterized by the corresponding Lie groups and Lie algebras. In this work, we take advantage of these structures to build effective dynamical models that are amenable to sample-based learning. We hypothesize that learning the dynamics on a Lie algebra vector space is more effective than learning a direct state transition model. To verify this hypothesis, we introduce the Group Enhanced Model (GEM). GEMs significantly outperform conventional transition models on tasks of long-term prediction, planning, and model-based reinforcement learning across a diverse suite of standard continuous-control environments, including Walker, Hopper, Reacher, Half-Cheetah, Inverted Pendulums, Ant, and Humanoid. Furthermore, plugging GEM into existing state-of-the-art systems enhances their performance, which we demonstrate on the PETS system. This work sheds light on a connection between learning of dynamics and Lie group properties, which opens doors for new research directions and practical applications. Our code is publicly available at: https://tinyurl.com/GEMMBRL.


Deep learning for prediction of complex geology ahead of drilling

arXiv.org Machine Learning

During a geosteering operation the well path is intentionally adjusted in response to the new data acquired while drilling. To achieve consistent high-quality decisions, especially when drilling in complex environments, decision support systems can help cope with high volumes of data and interpretation complexities. They can assimilate the real-time measurements into a probabilistic earth model and use the updated model for decision recommendations. Recently, machine learning (ML) techniques have enabled a wide range of methods that redistribute computational cost from on-line to off-line calculations. In this paper, we introduce two ML techniques into the geosteering decision support framework. Firstly, a complex earth model representation is generated using a Generative Adversarial Network (GAN). Secondly, a commercial extra-deep electromagnetic simulator is represented using a Forward Deep Neural Network (FDNN). The numerical experiments demonstrate that the combination of the GAN and the FDNN in an ensemble randomized maximum likelihood data assimilation scheme provides real-time estimates of complex geological uncertainty. This yields reduction of geological uncertainty ahead of the drill-bit from the measurements gathered behind and around the well bore.


Neural Network-based Control for Multi-Agent Systems from Spatio-Temporal Specifications

arXiv.org Artificial Intelligence

We propose a framework for solving control synthesis problems for multi-agent networked systems required to satisfy spatio-temporal specifications. We use Spatio-Temporal Reach and Escape Logic (STREL) as a specification language. For this logic, we define smooth quantitative semantics, which captures the degree of satisfaction of a formula by a multi-agent team. We use the novel quantitative semantics to map control synthesis problems with STREL specifications to optimization problems and propose a combination of heuristic and gradient-based methods to solve such problems. As this method might not meet the requirements of a real-time implementation, we develop a machine learning technique that uses the results of the off-line optimizations to train a neural network that gives the control inputs at current states. We illustrate the effectiveness of the proposed framework by applying it to a model of a robotic team required to satisfy a spatial-temporal specification under communication constraints.
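One common way to make a quantitative (robustness) semantics smooth enough for gradient-based optimization is to replace the hard `min` and `max` operators with log-sum-exp approximations. The abstract does not state which smoothing the authors define for STREL, so the sketch below is an assumption that illustrates the general idea only; `smooth_max`, `smooth_min`, the sharpness `k`, and the margin values are hypothetical.

```python
import math

def smooth_max(values, k=10.0):
    """Log-sum-exp approximation of max; differentiable in every input.

    Satisfies max(values) <= smooth_max <= max(values) + log(n) / k.
    """
    m = max(values)  # shift by the maximum for numerical stability
    return m + math.log(sum(math.exp(k * (v - m)) for v in values)) / k

def smooth_min(values, k=10.0):
    """Smooth approximation of min, obtained by negating smooth_max."""
    return -smooth_max([-v for v in values], k)

# Hypothetical satisfaction margins of a predicate at three time steps.
# An "always" operator takes their minimum; an "eventually" their maximum.
margins = [0.5, 0.2, 0.9]
always_score = smooth_min(margins)
eventually_score = smooth_max(margins)
print(always_score, eventually_score)
```

Larger `k` brings the approximation closer to the hard operators at the cost of sharper gradients; the error is bounded by `log(n) / k` for `n` inputs.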


A non-asymptotic penalization criterion for model selection in mixture of experts models

arXiv.org Artificial Intelligence

Mixture of experts (MoE) is a popular class of models in statistics and machine learning that has sustained attention over the years, due to its flexibility and effectiveness. We consider the Gaussian-gated localized MoE (GLoME) regression model for modeling heterogeneous data. This model poses challenging questions with respect to the statistical estimation and model selection problems, including feature selection, both from the computational and theoretical points of view. We study the problem of estimating the number of components of the GLoME model, in a penalized maximum likelihood estimation framework. We provide a lower bound on the penalty that ensures a weak oracle inequality is satisfied by our estimator. To support our theoretical result, we perform numerical experiments on simulated and real data, which illustrate the performance of our finite-sample oracle inequality.


Ensemble deep learning: A review

arXiv.org Artificial Intelligence

Ensemble learning combines several individual models to obtain better generalization performance. Currently, deep learning models with multilayer processing architectures are showing better performance than shallow or traditional classification models. Deep ensemble learning models combine the advantages of both deep learning models and ensemble learning such that the final model has better generalization performance. This paper reviews the state-of-the-art deep ensemble models and hence serves as an extensive summary for researchers. The ensemble models are broadly categorised into ensemble models like bagging, boosting and stacking, negative correlation based deep ensemble models, explicit/implicit ensembles, homogeneous/heterogeneous ensembles, decision fusion strategies, unsupervised, semi-supervised, reinforcement learning and online/incremental, multilabel based deep ensemble models. Application of deep ensemble models in different domains is also briefly discussed. Finally, we conclude this paper with some future recommendations and research directions.
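Two of the simplest decision fusion strategies of the kind this review surveys are majority voting over predicted labels and averaging of predicted probabilities. The sketch below illustrates both under stated assumptions: the function names, the three hypothetical models, and their outputs are invented for the example and do not come from the paper.

```python
from collections import Counter

def majority_vote(predictions):
    """Fuse hard labels: predictions[m][i] is model m's label for sample i."""
    return [Counter(sample_votes).most_common(1)[0][0]
            for sample_votes in zip(*predictions)]

def average_probabilities(prob_lists):
    """Fuse soft outputs: prob_lists[m][i] is model m's probability for sample i."""
    n_models = len(prob_lists)
    return [sum(ps) / n_models for ps in zip(*prob_lists)]

# Three hypothetical models labelling the same three samples.
predictions = [
    ["cat", "dog", "dog"],
    ["cat", "cat", "dog"],
    ["dog", "cat", "dog"],
]
fused_labels = majority_vote(predictions)
fused_probs = average_probabilities([[0.5, 0.25], [1.0, 0.75]])
print(fused_labels)  # ['cat', 'cat', 'dog']
print(fused_probs)   # [0.75, 0.5]
```

An odd number of models with two classes avoids ties here; a real system would need an explicit tie-breaking rule.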