AITopics | Mordatch, Igor

Plotting

Mordatch, Igor

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

The Neural MMO Platform for Massively Multiagent Research

Suarez, Joseph, Du, Yilun, Zhu, Clare, Mordatch, Igor, Isola, Phillip

arXiv.org Artificial IntelligenceOct-14-2021

Neural MMO is a computationally accessible research platform that combines large agent populations, long time horizons, open-ended tasks, and modular game systems. Existing environments feature subsets of these properties, but Neural MMO is the first to combine them all. We present Neural MMO as free and open source software with active support, ongoing development, documentation, and additional training, logging, and visualization tools to help users adapt to this new setting. Initial baselines on the platform demonstrate that agents trained in large populations explore more and learn a progression of skills. We raise other more difficult problems such as many-team cooperation as open research questions which Neural MMO is well-suited to answer. Finally, we discuss current limitations of the platform, potential mitigations, and plans for continued development.

artificial intelligence, machine learning, reinforcement learning, (20 more...)

arXiv.org Artificial Intelligence

2110.07594

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.48)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Government (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.97)

Add feedback

Model-Based Reinforcement Learning via Latent-Space Collocation

Rybkin, Oleh, Zhu, Chuning, Nagabandi, Anusha, Daniilidis, Kostas, Mordatch, Igor, Levine, Sergey

arXiv.org Artificial IntelligenceAug-7-2021

The ability to plan into the future while utilizing only raw high-dimensional observations, such as images, can provide autonomous agents with broad capabilities. Visual model-based reinforcement learning (RL) methods that plan future actions directly have shown impressive results on tasks that require only short-horizon reasoning, however, these methods struggle on temporally extended tasks. We argue that it is easier to solve long-horizon tasks by planning sequences of states rather than just actions, as the effects of actions greatly compound over time and are harder to optimize. To achieve this, we draw on the idea of collocation, which has shown good results on long-horizon tasks in optimal control literature, and adapt it to the image-based setting by utilizing learned latent state space models. The resulting latent collocation method (LatCo) optimizes trajectories of latent states, which improves over previously proposed shooting methods for visual model-based RL on tasks with sparse rewards and long-term goals. Videos and code at https://orybkin.github.io/latco/.

deep learning, neural network, optimization, (18 more...)

arXiv.org Artificial Intelligence

2106.13229

Country: North America > United States (0.28)

Genre: Research Report (0.64)

Industry:

Education (0.46)
Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Scalable Evaluation of Multi-Agent Reinforcement Learning with Melting Pot

Leibo, Joel Z., Duéñez-Guzmán, Edgar, Vezhnevets, Alexander Sasha, Agapiou, John P., Sunehag, Peter, Koster, Raphael, Matyas, Jayd, Beattie, Charles, Mordatch, Igor, Graepel, Thore

arXiv.org Artificial IntelligenceJul-14-2021

Existing evaluation suites for multi-agent reinforcement learning (MARL) do not assess generalization to novel situations as their primary objective (unlike supervised-learning benchmarks). Our contribution, Melting Pot, is a MARL evaluation suite that fills this gap, and uses reinforcement learning to reduce the human labor required to create novel test scenarios. This works because one agent's behavior constitutes (part of) another agent's environment. To demonstrate scalability, we have created over 80 unique test scenarios covering a broad range of research topics such as social dilemmas, reciprocity, resource sharing, and task partitioning. We apply these test scenarios to standard MARL training algorithms, and demonstrate how Melting Pot reveals weaknesses not apparent from training performance alone.

agent, computer game, deep learning, (19 more...)

arXiv.org Artificial Intelligence

2107.06857

Country: Europe (0.14)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Games > Computer Games (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.45)

Add feedback

Brax -- A Differentiable Physics Engine for Large Scale Rigid Body Simulation

Freeman, C. Daniel, Frey, Erik, Raichuk, Anton, Girgin, Sertan, Mordatch, Igor, Bachem, Olivier

arXiv.org Artificial IntelligenceJun-24-2021

We present Brax, an open source library for rigid body simulation with a focus on performance and parallelism on accelerators, written in JAX. We present results on a suite of tasks inspired by the existing reinforcement learning literature, but remade in our engine. Additionally, we provide reimplementations of PPO, SAC, ES, and direct policy optimization in JAX that compile alongside our environments, allowing the learning algorithm and the environment processing to occur on the same device, and to scale seamlessly on accelerators. Finally, we include notebooks that facilitate training of performant policies on common OpenAI Gym MuJoColike tasks in minutes. Figure 1: The suite of examples environments included in the initial release of Brax. From left to right: ant, fetch, grasp, halfcheetah, and humanoid.

artificial intelligence, brax, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

2106.13281

Genre: Research Report (0.40)

Industry:

Education (0.48)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

Pretrained Transformers as Universal Computation Engines

Lu, Kevin, Grover, Aditya, Abbeel, Pieter, Mordatch, Igor

arXiv.org Artificial IntelligenceMar-9-2021

We investigate the capability of a transformer pretrained on natural language to generalize to other modalities with minimal finetuning -- in particular, without finetuning of the self-attention and feedforward layers of the residual blocks. We consider such a model, which we call a Frozen Pretrained Transformer (FPT), and study finetuning it on a variety of sequence classification tasks spanning numerical computation, vision, and protein fold prediction. In contrast to prior works which investigate finetuning on the same modality as the pretraining dataset, we show that pretraining on natural language improves performance and compute efficiency on non-language downstream tasks. In particular, we find that such pretraining enables FPT to generalize in zero-shot to these modalities, matching the performance of a transformer fully trained on these tasks.

deep learning, neural network, transformer, (21 more...)

arXiv.org Artificial Intelligence

2103.05247

Country:

Europe (0.46)
North America > United States > California (0.14)

Genre: Research Report > New Finding (0.47)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)

Add feedback

Reset-Free Lifelong Learning with Skill-Space Planning

Lu, Kevin, Grover, Aditya, Abbeel, Pieter, Mordatch, Igor

arXiv.org Artificial IntelligenceDec-7-2020

The objective of lifelong reinforcement learning (RL) is to optimize agents which can continuously adapt and interact in changing environments. However, current RL approaches fail drastically when environments are non-stationary and interactions are non-episodic. We propose Lifelong Skill Planning (LiSP), an algorithmic framework for non-episodic lifelong RL based on planning in an abstract space of higher-order skills. We learn the skills in an unsupervised manner using intrinsic rewards and plan over the learned skills using a learned dynamics model. Moreover, our framework permits skill discovery even from offline data, thereby reducing the need for excessive real-world interactions. We demonstrate empirically that LiSP successfully enables long-horizon planning and learns agents that can avoid catastrophic failures even in challenging non-stationary and non-episodic environments derived from gridworld and MuJoCo benchmarks.

agent, artificial intelligence, educational setting, (19 more...)

arXiv.org Artificial Intelligence

2012.03548

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (0.46)

Industry: Education > Educational Setting > Continuing Education (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Add feedback

Rearrangement: A Challenge for Embodied AI

Batra, Dhruv, Chang, Angel X., Chernova, Sonia, Davison, Andrew J., Deng, Jia, Koltun, Vladlen, Levine, Sergey, Malik, Jitendra, Mordatch, Igor, Mottaghi, Roozbeh, Savva, Manolis, Su, Hao

arXiv.org Artificial IntelligenceNov-3-2020

We describe a framework for research and evaluation in Embodied AI. Our proposal is based on a canonical task: Rearrangement. A standard task can focus the development of new techniques and serve as a source of trained models that can be transferred to other settings. In the rearrangement task, the goal is to bring a given physical environment into a specified state. The goal state can be specified by object poses, by images, by a description in language, or by letting the agent experience the environment in the goal state. We characterize rearrangement scenarios along different axes and describe metrics for benchmarking rearrangement performance. To facilitate research and exploration, we present experimental testbeds of rearrangement scenarios in four different simulation environments. We anticipate that other datasets will be released and new simulation platforms will be built to support training of rearrangement agents and their deployment on physical systems.

agent, computer game, neural network, (23 more...)

arXiv.org Artificial Intelligence

2011.01975

Genre: Research Report (0.40)

Industry:

Leisure & Entertainment > Sports (0.46)
Leisure & Entertainment > Games > Computer Games (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots > Manipulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
(6 more...)

Add feedback

$\gamma$-Models: Generative Temporal Difference Learning for Infinite-Horizon Prediction

Janner, Michael, Mordatch, Igor, Levine, Sergey

arXiv.org Artificial IntelligenceOct-27-2020

We introduce the $\gamma$-model, a predictive model of environment dynamics with an infinite probabilistic horizon. Replacing standard single-step models with $\gamma$-models leads to generalizations of the procedures that form the foundation of model-based control, including the model rollout and model-based value estimation. The $\gamma$-model, trained with a generative reinterpretation of temporal difference learning, is a natural continuous analogue of the successor representation and a hybrid between model-free and model-based mechanisms. Like a value function, it contains information about the long-term future; like a standard predictive model, it is independent of task reward. We instantiate the $\gamma$-model as both a generative adversarial network and normalizing flow, discuss how its training reflects an inescapable tradeoff between training-time and testing-time compounding errors, and empirically investigate its utility for prediction and control.

artificial intelligence, neural information processing system, reinforcement learning, (13 more...)

arXiv.org Artificial Intelligence

2010.14496

Country: North America (0.28)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

One Policy to Control Them All: Shared Modular Policies for Agent-Agnostic Control

Huang, Wenlong, Mordatch, Igor, Pathak, Deepak

arXiv.org Machine LearningJul-9-2020

Reinforcement learning is typically concerned with learning control policies tailored to a particular agent. We investigate whether there exists a single global policy that can generalize to control a wide variety of agent morphologies -- ones in which even dimensionality of state and action spaces changes. We propose to express this global policy as a collection of identical modular neural networks, dubbed as Shared Modular Policies (SMP), that correspond to each of the agent's actuators. Every module is only responsible for controlling its corresponding actuator and receives information from only its local sensors. In addition, messages are passed between modules, propagating information between distant modules. We show that a single modular policy can successfully generate locomotion behaviors for several planar agents with different skeletal structures such as monopod hoppers, quadrupeds, bipeds, and generalize to variants not seen during training -- a process that would normally require training and manual hyperparameter tuning for each morphology. We observe that a wide variety of drastically diverse locomotion styles across morphologies as well as centralized coordination emerges via message passing between decentralized modules purely from the reinforcement learning objective. Videos and code at https://huangwl18.github.io/modular-rl/

morphology, neural network, neurology, (19 more...)

arXiv.org Machine Learning

2007.04976

Country: Europe > Austria (0.28)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Emergent Tool Use From Multi-Agent Autocurricula

Baker, Bowen, Kanitscheider, Ingmar, Markov, Todor, Wu, Yi, Powell, Glenn, McGrew, Bob, Mordatch, Igor

arXiv.org Artificial IntelligenceSep-16-2019

Through multi-agent competition, the simple objective of hide-and-seek, and standard reinforcement learning algorithms at scale, we find that agents create a self-supervised autocurriculum inducing multiple distinct rounds of emergent strategy, many of which require sophisticated tool use and coordination. We find clear evidence of six emergent phases in agent strategy in our environment, each of which creates a new pressure for the opposing team to adapt; for instance, agents learn to build multi-object shelters using moveable boxes which in turn leads to agents discovering that they can overcome obstacles using ramps. We further provide evidence that multi-agent competition may scale better with increasing environment complexity and leads to behavior that centers around far more human-relevant skills than other self-supervised reinforcement learning methods such as intrinsic motivation. Finally, we propose transfer and fine-tuning as a way to quantitatively evaluate targeted capabilities, and we compare hide-and-seek agents to both intrinsic motivation and random initialization baselines in a suite of domain-specific intelligence tests.

agent, deep learning, neural network, (18 more...)

arXiv.org Artificial Intelligence

1909.07528

Genre: Research Report (0.82)

Industry:

Leisure & Entertainment > Games (1.00)
Education (1.00)
Commercial Services & Supplies > Security & Alarm Services (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback