AITopics | Merel, Josh

Collaborating Authors

Merel, Josh

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Emergent Coordination Through Competition

Liu, Siqi, Lever, Guy, Merel, Josh, Tunyasuvunakool, Saran, Heess, Nicolas, Graepel, Thore

arXiv.org Artificial IntelligenceFeb-21-2019

We study the emergence of cooperative behaviors in reinforcement learning agents by introducing a challenging competitive multi-agent soccer environment with continuous simulated physics. We demonstrate that decentralized, population-based training with co-play can lead to a progression in agents' behaviors: from random, to simple ball chasing, and finally showing evidence of cooperation. Our study highlights several of the challenges encountered in large scale multi-agent training in continuous control. In particular, we demonstrate that the automatic optimization of simple shaping rewards, not themselves conducive to co-operative behavior, can lead to long-horizon team behavior. We further apply an evaluation scheme, grounded by game theoretic principals, that can assess agent performance in the absence of pre-defined evaluation tasks or human baselines.

agent, deep learning, soccer, (21 more...)

arXiv.org Artificial Intelligence

1902.07151

Country:

Europe (1.00)
North America > United States > Louisiana (0.14)
North America > United States > Texas (0.14)
(2 more...)

Genre: Research Report (0.64)

Industry:

Leisure & Entertainment > Sports > Soccer (0.68)
Leisure & Entertainment > Games > Chess (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)

Add feedback

Neural probabilistic motor primitives for humanoid control

Merel, Josh, Hasenclever, Leonard, Galashov, Alexandre, Ahuja, Arun, Pham, Vu, Wayne, Greg, Teh, Yee Whye, Heess, Nicolas

arXiv.org Artificial IntelligenceNov-28-2018

We focus on the problem of learning a single motor module that can flexibly express a range of behaviors for the control of high-dimensional physically simulated humanoids. To do this, we propose a motor architecture that has the general structure of an inverse model with a latent-variable bottleneck. We show that it is possible to train this model entirely offline to compress thousands of expert policies and learn a motor primitive embedding space. The trained neural probabilistic motor primitive system can perform one-shot imitation of whole-body humanoid behaviors, robustly mimicking unseen trajectories. Additionally, we demonstrate that it is also straightforward to train controllers to reuse the learned motor primitive space to solve tasks, and the resulting movements are relatively naturalistic. To support the training of our model, we compare two approaches for offline policy cloning, including an experience efficient method which we call linear feedback policy cloning. We encourage readers to view a supplementary video summarizing our results ( https://youtu.be/1NAHsrrH2t0 ).

artificial intelligence, neural network, trajectory, (16 more...)

arXiv.org Artificial Intelligence

1811.11711

Country: North America > United States (0.14)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (0.93)

Add feedback

Hierarchical visuomotor control of humanoids

Merel, Josh, Ahuja, Arun, Pham, Vu, Tunyasuvunakool, Saran, Liu, Siqi, Tirumala, Dhruva, Heess, Nicolas, Wayne, Greg

arXiv.org Artificial IntelligenceNov-23-2018

We aim to build complex humanoid agents that integrate perception, motor control, and memory. In this work, we partly factor this problem into low-level motor control from proprioception and high-level coordination of the low-level skills informed by vision. We develop an architecture capable of surprisingly flexible, task-directed motor control of a relatively high-DoF humanoid body by combining pre-training of low-level motor controllers with a high-level, task-focused controller that switches among low-level sub-policies. The resulting system is able to control a physically-simulated humanoid body to solve tasks that require coupling visual perception from an unstabilized egocentric RGB camera during locomotion in the environment. For a supplementary video link, see https://youtu.be/7GISvfbykLE .

deep learning, fragment, neural network, (18 more...)

arXiv.org Artificial Intelligence

1811.09656

Country: North America > United States (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.95)
Information Technology > Artificial Intelligence > Robots (0.94)
(2 more...)

Add feedback

Graph networks as learnable physics engines for inference and control

Sanchez-Gonzalez, Alvaro, Heess, Nicolas, Springenberg, Jost Tobias, Merel, Josh, Riedmiller, Martin, Hadsell, Raia, Battaglia, Peter

arXiv.org Machine LearningJun-4-2018

Understanding and interacting with everyday physical scenes requires rich knowledge about the structure of the world, represented either implicitly in a value or policy function, or explicitly in a transition model. Here we introduce a new class of learnable models--based on graph networks--which implement an inductive bias for object- and relation-centric representations of complex, dynamical systems. Our results show that as a forward model, our approach supports accurate predictions from real and simulated data, and surprisingly strong and efficient generalization, across eight distinct physical systems which we varied parametrically and structurally. We also found that our inference model can perform system identification. Our models are also differentiable, and support online planning via gradient-based trajectory optimization, as well as offline policy optimization. Our framework offers new opportunities for harnessing and exploiting rich knowledge about the world, and takes a key step toward building machines with more human-like representations of the world.

deep learning, graph, neural network, (19 more...)

arXiv.org Machine Learning

1806.01242

Country:

Europe > United Kingdom (0.14)
Europe > Sweden (0.14)

Genre: Research Report > New Finding (0.86)

Industry: Energy > Oil & Gas (0.48)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(2 more...)

Add feedback

Reinforcement and Imitation Learning for Diverse Visuomotor Skills

Zhu, Yuke, Wang, Ziyu, Merel, Josh, Rusu, Andrei, Erez, Tom, Cabi, Serkan, Tunyasuvunakool, Saran, Kramár, János, Hadsell, Raia, de Freitas, Nando, Heess, Nicolas

arXiv.org Artificial IntelligenceFeb-26-2018

We propose a model-free deep reinforcement learning method that leverages a small amount of demonstration data to assist a reinforcement learning agent. We apply this approach to robotic manipulation tasks and train end-to-end visuomotor policies that map directly from RGB camera inputs to joint velocities. We demonstrate that our approach can solve a wide variety of visuomotor tasks, for which engineering a scripted controller would be laborious. Our experiments indicate that our reinforcement and imitation agent achieves significantly better performances than agents trained with reinforcement learning or imitation learning alone. We also illustrate that these policies, trained with large visual and dynamics variations, can achieve preliminary successes in zero-shot sim2real transfer. A brief visual description of this work can be viewed in https://youtu.be/EDl8SQUNjj0

computer game, deep learning, demonstration, (21 more...)

arXiv.org Artificial Intelligence

1802.09564

Country: North America > United States > New York (0.14)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games > Computer Games (0.67)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Neuroprosthetic decoder training as imitation learning

Merel, Josh, Carlson, David, Paninski, Liam, Cunningham, John P.

arXiv.org Machine LearningMar-14-2016

Neuroprosthetic brain-computer interfaces function via an algorithm which decodes neural activity of the user into movements of an end effector, such as a cursor or robotic arm. In practice, the decoder is often learned by updating its parameters while the user performs a task. When the user's intention is not directly observable, recent methods have demonstrated value in training the decoder against a surrogate for the user's intended movement. We describe how training a decoder in this way is a novel variant of an imitation learning problem, where an oracle or expert is employed for supervised training in lieu of direct observations, which are not available. Specifically, we describe how a generic imitation learning meta-algorithm, dataset aggregation (DAgger, [1]), can be adapted to train a generic brain-computer interface. By deriving existing learning algorithms for brain-computer interfaces in this framework, we provide a novel analysis of regret (an important metric of learning efficacy) for brain-computer interfaces. This analysis allows us to characterize the space of algorithmic variants and bounds on their regret rates. Existing approaches for decoder learning have been performed in the cursor control setting, but the available design principles for these decoders are such that it has been impossible to scale them to naturalistic settings. Leveraging our findings, we then offer an algorithm that combines imitation learning with optimal control, which should allow for training of arbitrary effectors for which optimal control can generate goal-oriented control. We demonstrate this novel and general BCI algorithm with simulated neuroprosthetic control of a 26 degree-of-freedom model of an arm, a sophisticated and realistic end effector.

decoder, neural network, optimization problem, (21 more...)

arXiv.org Machine Learning

doi: 10.1371/journal.pcbi.1004948

1511.04156

Country: North America > United States > New York (0.14)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.93)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Neuroscience (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)

Add feedback