Learning a Diffusion Model Policy from Rewards via Q-Score Matching
Psenka, Michael, Escontrela, Alejandro, Abbeel, Pieter, Ma, Yi
Diffusion models have become a popular choice for representing actor policies in behavior cloning and offline reinforcement learning, owing to their natural ability to optimize an expressive class of distributions over a continuous space. However, previous works fail to exploit the score-based structure of diffusion models and instead rely on a simple behavior cloning term to train the actor, limiting their applicability in the actor-critic setting. In this paper, we focus on off-policy reinforcement learning and propose a new method for learning a diffusion model policy that exploits the link between the score of the policy and the action gradient of the Q-function. We call this method Q-score matching and provide theoretical justification for the approach. We conduct experiments in simulated environments to demonstrate the effectiveness of our proposed method and compare it to popular baselines.
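The core idea described above — training the policy's score to align with the critic's action gradient — can be illustrated with a toy sketch. This is not the paper's implementation: the network shapes and names are invented, and a single score network stands in for the full diffusion denoiser evaluated across noise levels.

```python
import torch
import torch.nn as nn

state_dim, action_dim = 3, 2

# Hypothetical stand-ins: a score network for the policy and a critic Q(s, a).
score_net = nn.Sequential(nn.Linear(state_dim + action_dim, 32), nn.ReLU(),
                          nn.Linear(32, action_dim))
q_net = nn.Sequential(nn.Linear(state_dim + action_dim, 32), nn.ReLU(),
                      nn.Linear(32, 1))

def qsm_loss(states, actions):
    """Match the policy score to the critic's action gradient (simplified)."""
    actions = actions.detach().requires_grad_(True)
    q = q_net(torch.cat([states, actions], dim=-1)).sum()
    # dQ/da: the direction in action space that increases the critic's value.
    grad_q = torch.autograd.grad(q, actions)[0]
    pred_score = score_net(torch.cat([states, actions], dim=-1))
    # Regress the predicted score onto the (detached) critic gradient.
    return ((pred_score - grad_q.detach()) ** 2).mean()

states = torch.randn(8, state_dim)
actions = torch.randn(8, action_dim)
loss = qsm_loss(states, actions)
```

In practice the gradient target would be combined with the diffusion model's noise schedule, but the sketch captures the linked structure the abstract refers to: the critic's action gradient serves as the regression target for the policy's score.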
Video Prediction Models as Rewards for Reinforcement Learning
Escontrela, Alejandro, Adeniji, Ademi, Yan, Wilson, Jain, Ajay, Peng, Xue Bin, Goldberg, Ken, Lee, Youngwoon, Hafner, Danijar, Abbeel, Pieter
Specifying reward signals that allow agents to learn complex behaviors is a long-standing challenge in reinforcement learning. A promising approach is to extract preferences for behaviors from unlabeled videos, which are widely available on the internet. We present Video Prediction Rewards (VIPER), an algorithm that leverages pretrained video prediction models as action-free reward signals for reinforcement learning. Specifically, we first train an autoregressive transformer on expert videos and then use the video prediction likelihoods as reward signals for a reinforcement learning agent. VIPER enables expert-level control without programmatic task rewards across a wide range of DMC, Atari, and RLBench tasks. Moreover, generalization of the video prediction model allows us to derive rewards for an out-of-distribution environment where no expert data is available, enabling cross-embodiment generalization for tabletop manipulation. We see our work as a starting point for scalable reward specification from unlabeled videos that will benefit from the rapid advances in generative modeling. Source code and datasets are available on the project website: https://escontrela.me/viper
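The reward mechanism the abstract describes — scoring an agent's observed frames by the log-likelihood an autoregressive video model assigns to them — can be sketched as follows. This is an illustrative toy, not VIPER itself: a small GRU over discrete frame tokens stands in for the pretrained transformer, and all names and sizes are invented.

```python
import torch
import torch.nn.functional as F

vocab, hidden = 16, 32

# Stand-in "video model": embeds frame tokens and predicts the next token.
embed = torch.nn.Embedding(vocab, hidden)
gru = torch.nn.GRU(hidden, hidden, batch_first=True)
head = torch.nn.Linear(hidden, vocab)

def viper_rewards(frame_tokens):
    """Per-step reward = log-likelihood the model assigns to the next frame token."""
    h, _ = gru(embed(frame_tokens[:, :-1]))          # condition on the prefix
    logp = F.log_softmax(head(h), dim=-1)            # log p(. | x_<t)
    # Gather log p(x_t | x_<t) for the tokens actually observed.
    return logp.gather(-1, frame_tokens[:, 1:].unsqueeze(-1)).squeeze(-1)

tokens = torch.randint(0, vocab, (2, 10))  # batch of 2 token sequences
rewards = viper_rewards(tokens)            # one reward per predicted step
```

Trajectories that look like the expert videos the model was trained on receive high log-likelihood, and hence high reward, without any programmatic task reward.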
Barkour: Benchmarking Animal-level Agility with Quadruped Robots
Caluwaerts, Ken, Iscen, Atil, Kew, J. Chase, Yu, Wenhao, Zhang, Tingnan, Freeman, Daniel, Lee, Kuang-Huei, Lee, Lisa, Saliceti, Stefano, Zhuang, Vincent, Batchelor, Nathan, Bohez, Steven, Casarini, Federico, Chen, Jose Enrique, Cortes, Omar, Coumans, Erwin, Dostmohamed, Adil, Dulac-Arnold, Gabriel, Escontrela, Alejandro, Frey, Erik, Hafner, Roland, Jain, Deepali, Jyenis, Bauyrjan, Kuang, Yuheng, Lee, Edward, Luu, Linda, Nachum, Ofir, Oslund, Ken, Powell, Jason, Reyes, Diego, Romano, Francesco, Sadeghi, Fereshteh, Sloat, Ron, Tabanpour, Baruch, Zheng, Daniel, Neunert, Michael, Hadsell, Raia, Heess, Nicolas, Nori, Francesco, Seto, Jeff, Parada, Carolina, Sindhwani, Vikas, Vanhoucke, Vincent, Tan, Jie
Animals have evolved various agile locomotion strategies, such as sprinting, leaping, and jumping. There is growing interest in developing legged robots that move like their biological counterparts and exhibit various agile skills to navigate complex environments quickly. Despite this interest, the field lacks systematic benchmarks to measure the performance of control policies and hardware in agility. We introduce the Barkour benchmark, an obstacle course to quantify agility for legged robots. Inspired by dog agility competitions, it consists of diverse obstacles and a time-based scoring mechanism. This encourages researchers to develop controllers that not only move fast, but do so in a controllable and versatile way. To set strong baselines, we present two methods for tackling the benchmark. In the first approach, we train specialist locomotion skills using on-policy reinforcement learning methods and combine them with a high-level navigation controller. In the second approach, we distill the specialist skills into a Transformer-based generalist locomotion policy, named Locomotion-Transformer, that can handle various terrains and adjust the robot's gait based on the perceived environment.

There has been a proliferation of legged robot development inspired by animal mobility. An important research question in this field is how to develop a controller that enables legged robots to exhibit animal-level agility while also being able to generalize across various obstacles and terrains. Through the exploration of both learning and traditional control-based methods, there has been significant progress in enabling robots to walk across a wide range of terrains [10, 21, 20, 1, 27]. These robots are now capable of walking in a variety of indoor and outdoor environments, such as up and down stairs, through bushes, and over unpaved roads and rocky or even sandy beaches.

Despite advances in robot hardware and control, a major challenge in the field is the lack of standardized and intuitive methods for evaluating the effectiveness of locomotion controllers.