dmap
DMAP: a Distributed Morphological Attention Policy for learning to locomote with a changing body
Biological and artificial agents need to deal with constant changes in the real world. We study this problem in four classical continuous control environments, augmented with morphological perturbations. Learning to locomote when the length and the thickness of different body parts vary is challenging, as the control policy is required to adapt to the morphology to successfully balance and advance the agent. We show that a control policy based on the proprioceptive state performs poorly with highly variable body configurations, while an (oracle) agent with access to a learned encoding of the perturbation performs significantly better. We introduce DMAP, a biologically-inspired, attention-based policy network architecture. DMAP combines independent proprioceptive processing, a distributed policy with individual controllers for each joint, and an attention mechanism, to dynamically gate sensory information from different body parts to different controllers. Despite not having access to the (hidden) morphology information, DMAP can be trained end-to-end in all the considered environments, overall matching or surpassing the performance of an oracle agent. Thus DMAP, implementing principles from biological motor control, provides a strong inductive bias for learning challenging sensorimotor tasks.
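The three ingredients named in the abstract (independent proprioceptive processing, per-joint controllers, and attention-based gating) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the shared scalar encoder, the linear controllers, and all shapes are assumptions chosen only to make the data flow concrete.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def init_params(n_joints, d, rng):
    """Hypothetical parameter set; sizes are illustrative only."""
    return {
        "w_enc": rng.normal(size=d),               # shared per-sensor encoder
        "b_enc": np.zeros(d),
        "w_att": rng.normal(size=(d, n_joints)),   # attention scores per joint
        "w_ctrl": rng.normal(size=(n_joints, d)),  # one linear controller per joint
    }

def gated_distributed_policy(obs, params):
    """Map (n_sensors,) proprioceptive readings to (n_joints,) actions."""
    # Independent processing: every sensor reading is embedded on its own,
    # giving a feature matrix of shape (n_sensors, d).
    feats = np.tanh(obs[:, None] * params["w_enc"][None, :] + params["b_enc"])
    # Attention: one score per (joint, sensor) pair, normalized over sensors.
    scores = feats @ params["w_att"]           # (n_sensors, n_joints)
    attn = softmax(scores.T, axis=1)           # (n_joints, n_sensors)
    # Gating: each joint receives its own weighted mixture of sensor features.
    mixed = attn @ feats                       # (n_joints, d)
    # Distributed policy: joint j only uses its own controller weights.
    actions = np.tanh(np.einsum("jd,jd->j", mixed, params["w_ctrl"]))
    return actions, attn
```

Each row of `attn` shows how strongly one joint controller attends to each sensor, which is the quantity one would inspect to see how gating shifts when the body changes.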
A Appendix
Numbers in bold denote a statistically significant difference between the two methods (p-value < 0.001, paired t-test). We also list the IID (Table T6) and OOD (Tables T7, T8 and T9) test results of all the agents trained for this work. Some values are negative because some agents, when tested far outside the training distribution, fail to walk and collect more penalties (e.g., for undesired contact forces or excessive energy expenditure) than positive reward. We also plot the reward as a function of perturbation intensity for the end-to-end trained Oracle, DMAP and TCN (Figure F2). Generally, DMAP performs similarly to the Oracle, while the TCN performs worse, especially for the more challenging morphologies (Ant, Walker).
Locomotion modeling evolves with brain-inspired neural networks - EPFL
A team of scientists at EPFL has built a new neural network system that can help us understand how animals adapt their movement to changes in their own bodies and help create more powerful artificial intelligence systems. Deep learning has been fueled by artificial neural networks, which stack simple computational elements on top of each other to create powerful learning systems. Given enough data, these systems can solve challenging tasks such as recognizing objects, beating humans at Go, and controlling robots. "As you can imagine, the architecture of how you stack these elements on top of each other might influence how much data you need to learn and what the ceiling performance is," says Professor Alexander Mathis at EPFL's School of Life Sciences. Working with doctoral students Alberto Chiappa and Alessandro Marin Vargas, he has developed a new network architecture called DMAP, for "Distributed Morphological Attention Policy".
DMAP: a Distributed Morphological Attention Policy for Learning to Locomote with a Changing Body
Chiappa, Alberto Silvio, Vargas, Alessandro Marin, Mathis, Alexander
Biological and artificial agents need to deal with constant changes in the real world. We study this problem in four classical continuous control environments, augmented with morphological perturbations. Learning to locomote when the length and the thickness of different body parts vary is challenging, as the control policy is required to adapt to the morphology to successfully balance and advance the agent. We show that a control policy based on the proprioceptive state performs poorly with highly variable body configurations, while an (oracle) agent with access to a learned encoding of the perturbation performs significantly better. We introduce DMAP, a biologically-inspired, attention-based policy network architecture. DMAP combines independent proprioceptive processing, a distributed policy with individual controllers for each joint, and an attention mechanism, to dynamically gate sensory information from different body parts to different controllers. Despite not having access to the (hidden) morphology information, DMAP can be trained end-to-end in all the considered environments, overall matching or surpassing the performance of an oracle agent. Thus DMAP, implementing principles from biological motor control, provides a strong inductive bias for learning challenging sensorimotor tasks. Overall, our work corroborates the power of these principles in challenging locomotion tasks.
Online Change Point Detection in Molecular Dynamics With Optical Random Features
Chatelain, Amélie, Tommasone, Elena, Daudet, Laurent, Poli, Iacopo
Proteins are made of constantly fluctuating atoms, but can occasionally undergo large-scale changes. Such transitions are of biological interest, linking the structure of a protein to its function within a cell. Atomic-level simulations, such as Molecular Dynamics (MD), are used to study these events. However, molecular dynamics simulations produce time series with multiple observables, while changes often affect only a few of them. Therefore, detecting conformational changes has proven to be challenging for most change-point detection algorithms. In this work, we focus on the identification of such events given many noisy observables. In particular, we show that the No-prior-Knowledge Exponential Weighted Moving Average (NEWMA) algorithm can be used alongside optical hardware to successfully identify these changes in real time. Our method does not need to distinguish between the background of a protein and the protein itself. For larger simulations, it is faster than using traditional silicon hardware and has a lower memory footprint. This technique may enhance the sampling of the conformational space of molecules. It may also be used to detect change-points in other sequential data with a large number of features.
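The core idea of NEWMA, comparing two exponentially weighted moving averages of a random feature embedding that forget at different rates, can be sketched as follows. This is an illustrative software version only: random Fourier features stand in for the optical random features, and the function name, forgetting factors, bandwidth, and feature count are assumptions rather than values from the paper. Thresholding of the statistic is left to the caller.

```python
import numpy as np

def newma_statistic(X, n_features=200, fast=0.1, slow=0.02,
                    bandwidth=3.0, seed=0):
    """Return one detection statistic per row of X, where X has shape (T, d)."""
    rng = np.random.default_rng(seed)
    T, d = X.shape
    # Random Fourier features (a software stand-in for optical features).
    W = rng.normal(size=(d, n_features)) / bandwidth
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    z_fast = np.zeros(n_features)  # average that forgets quickly
    z_slow = np.zeros(n_features)  # average that forgets slowly
    stats = np.empty(T)
    for t in range(T):
        psi = np.sqrt(2.0 / n_features) * np.cos(X[t] @ W + b)
        z_fast = (1.0 - fast) * z_fast + fast * psi
        z_slow = (1.0 - slow) * z_slow + slow * psi
        # The two averages disagree when the recent past differs from
        # the more distant past, i.e. around a change point.
        stats[t] = np.linalg.norm(z_fast - z_slow)
    return stats
```

Because only the two running averages are stored, memory use does not grow with the length of the time series, which is what makes the scheme suitable for online use.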
Graph Pattern Entity Ranking Model for Knowledge Graph Completion
Ebisu, Takuma, Ichise, Ryutaro
Knowledge graphs have evolved rapidly in recent years and their usefulness has been demonstrated in many artificial intelligence tasks. However, knowledge graphs are often missing many facts. To solve this problem, many knowledge graph embedding models have been developed to populate knowledge graphs, and these have shown outstanding performance. However, knowledge graph embedding models are so-called black boxes: the user does not know how the information in a knowledge graph is processed, and the models can be difficult to interpret. In this paper, we utilize graph patterns in a knowledge graph to overcome such problems. Our proposed model, the graph pattern entity ranking model (GRank), constructs an entity ranking system for each graph pattern and evaluates them using a ranking measure. By doing so, we can find graph patterns which are useful for predicting facts. Then, we perform link prediction tasks on standard datasets to evaluate our GRank method. We show that our approach outperforms other state-of-the-art approaches such as ComplEx and TorusE for standard metrics such as HITS@n and MRR. Moreover, our model is easily interpretable because the output facts are described by graph patterns.
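For readers unfamiliar with the two link prediction metrics mentioned above, the following sketch shows how they are commonly computed from the rank of the correct entity in each test query. The function names and the example ranks are illustrative assumptions, not taken from the paper.

```python
def mrr(ranks):
    """Mean reciprocal rank; ranks are 1-based positions of the correct entity."""
    return sum(1.0 / r for r in ranks) / len(ranks)

def hits_at_n(ranks, n):
    """Fraction of queries whose correct entity is ranked in the top n."""
    return sum(1 for r in ranks if r <= n) / len(ranks)

# Hypothetical ranks for four test queries.
example_ranks = [1, 2, 4, 10]
example_mrr = mrr(example_ranks)
example_hits3 = hits_at_n(example_ranks, 3)
```

Both metrics lie in (0, 1], and higher is better; MRR rewards placing the correct entity near the top, while HITS@n only checks whether it appears within the first n positions.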