AITopics | Undersander, Eric

Collaborating Authors

Undersander, Eric

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

PARTNR: A Benchmark for Planning and Reasoning in Embodied Multi-agent Tasks

Chang, Matthew, Chhablani, Gunjan, Clegg, Alexander, Cote, Mikael Dallaire, Desai, Ruta, Hlavac, Michal, Karashchuk, Vladimir, Krantz, Jacob, Mottaghi, Roozbeh, Parashar, Priyam, Patki, Siddharth, Prasad, Ishita, Puig, Xavier, Rai, Akshara, Ramrakhya, Ram, Tran, Daniel, Truong, Joanne, Turner, John M., Undersander, Eric, Yang, Tsung-Yen

arXiv.org Artificial IntelligenceOct-31-2024

We present a benchmark for Planning And Reasoning Tasks in humaN-Robot collaboration (PARTNR) designed to study human-robot coordination in household activities. PARTNR tasks exhibit characteristics of everyday tasks, such as spatial, temporal, and heterogeneous agent capability constraints. We employ a semi-automated task generation pipeline using Large Language Models (LLMs), incorporating simulation in the loop for grounding and verification. PARTNR stands as the largest benchmark of its kind, comprising 100,000 natural language tasks, spanning 60 houses and 5,819 unique objects. We analyze state-of-the-art LLMs on PARTNR tasks, across the axes of planning, perception and skill execution. The analysis reveals significant limitations in SoTA models, such as poor coordination and failures in task tracking and recovery from errors. When LLMs are paired with real humans, they require 1.5x as many steps as two humans collaborating and 1.1x more steps than a single human, underscoring the potential for improvement in these models. We further show that fine-tuning smaller LLMs with planning data can achieve performance on par with models 9 times larger, while being 8.6x faster at inference. Overall, PARTNR highlights significant challenges facing collaborative embodied agents and aims to drive research in this direction.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2411.00081

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry:

Information Technology (1.00)
Leisure & Entertainment (0.67)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

ASC: Adaptive Skill Coordination for Robotic Mobile Manipulation

Yokoyama, Naoki, Clegg, Alex, Truong, Joanne, Undersander, Eric, Yang, Tsung-Yen, Arnaud, Sergio, Ha, Sehoon, Batra, Dhruv, Rai, Akshara

arXiv.org Artificial IntelligenceNov-19-2023

We present Adaptive Skill Coordination (ASC) -- an approach for accomplishing long-horizon tasks like mobile pick-and-place (i.e., navigating to an object, picking it, navigating to another location, and placing it). ASC consists of three components -- (1) a library of basic visuomotor skills (navigation, pick, place), (2) a skill coordination policy that chooses which skill to use when, and (3) a corrective policy that adapts pre-trained skills in out-of-distribution states. All components of ASC rely only on onboard visual and proprioceptive sensing, without requiring detailed maps with obstacle layouts or precise object locations, easing real-world deployment. We train ASC in simulated indoor environments, and deploy it zero-shot (without any real-world experience or fine-tuning) on the Boston Dynamics Spot robot in eight novel real-world environments (one apartment, one lab, two microkitchens, two lounges, one office space, one outdoor courtyard). In rigorous quantitative comparisons in two environments, ASC achieves near-perfect performance (59/60 episodes, or 98%), while sequentially executing skills succeeds in only 44/60 (73%) episodes. Extensive perturbation experiments show that ASC is robust to hand-off errors, changes in the environment layout, dynamic obstacles (e.g., people), and unexpected disturbances. Supplementary videos at adaptiveskillcoordination.github.io.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2304.0041

Country: North America > United States (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Habitat 3.0: A Co-Habitat for Humans, Avatars and Robots

Puig, Xavier, Undersander, Eric, Szot, Andrew, Cote, Mikael Dallaire, Yang, Tsung-Yen, Partsey, Ruslan, Desai, Ruta, Clegg, Alexander William, Hlavac, Michal, Min, So Yeon, Vondruš, Vladimír, Gervet, Theophile, Berges, Vincent-Pierre, Turner, John M., Maksymets, Oleksandr, Kira, Zsolt, Kalakrishnan, Mrinal, Malik, Jitendra, Chaplot, Devendra Singh, Jain, Unnat, Batra, Dhruv, Rai, Akshara, Mottaghi, Roozbeh

arXiv.org Artificial IntelligenceOct-19-2023

We present Habitat 3.0: a simulation platform for studying collaborative human-robot tasks in home environments. Habitat 3.0 offers contributions across three dimensions: (1) Accurate humanoid simulation: addressing challenges in modeling complex deformable bodies and diversity in appearance and motion, all while ensuring high simulation speed. (2) Human-in-the-loop infrastructure: enabling real human interaction with simulated robots via mouse/keyboard or a VR interface, facilitating evaluation of robot policies with human input. (3) Collaborative tasks: studying two collaborative tasks, Social Navigation and Social Rearrangement. Social Navigation investigates a robot's ability to locate and follow humanoid avatars in unseen environments, whereas Social Rearrangement addresses collaboration between a humanoid and robot while rearranging a scene. These contributions allow us to study end-to-end learned and heuristic baselines for human-robot collaboration in-depth, as well as evaluate them with humans in the loop. Our experiments demonstrate that learned robot policies lead to efficient task completion when collaborating with unseen humanoid agents and human partners that might exhibit behaviors that the robot has not seen before. Additionally, we observe emergent behaviors during collaborative task execution, such as the robot yielding space when obstructing a humanoid agent, thereby allowing the effective completion of the task by the humanoid agent. Furthermore, our experiments using the human-in-the-loop tool demonstrate that our automated evaluation with humanoids can provide an indication of the relative ordering of different policies when evaluated with real human collaborators. Habitat 3.0 unlocks interesting new features in simulators for Embodied AI, and we hope it paves the way for a new frontier of embodied human-AI interaction capabilities.

artificial intelligence, avatar and robot, co-habitat, (2 more...)

arXiv.org Artificial Intelligence

2310.13724

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

Add feedback

Galactic: Scaling End-to-End Reinforcement Learning for Rearrangement at 100k Steps-Per-Second

Berges, Vincent-Pierre, Szot, Andrew, Chaplot, Devendra Singh, Gokaslan, Aaron, Mottaghi, Roozbeh, Batra, Dhruv, Undersander, Eric

arXiv.org Artificial IntelligenceJun-13-2023

We present Galactic, a large-scale simulation and reinforcement-learning (RL) framework for robotic mobile manipulation in indoor environments. Specifically, a Fetch robot (equipped with a mobile base, 7DoF arm, RGBD camera, egomotion, and onboard sensing) is spawned in a home environment and asked to rearrange objects - by navigating to an object, picking it up, navigating to a target location, and then placing the object at the target location. Galactic is fast. In terms of simulation speed (rendering + physics), Galactic achieves over 421,000 steps-per-second (SPS) on an 8-GPU node, which is 54x faster than Habitat 2.0 (7699 SPS). More importantly, Galactic was designed to optimize the entire rendering + physics + RL interplay since any bottleneck in the interplay slows down training. In terms of simulation+RL speed (rendering + physics + inference + learning), Galactic achieves over 108,000 SPS, which 88x faster than Habitat 2.0 (1243 SPS). These massive speed-ups not only drastically cut the wall-clock training time of existing experiments, but also unlock an unprecedented scale of new experiments. First, Galactic can train a mobile pick skill to >80% accuracy in under 16 minutes, a 100x speedup compared to the over 24 hours it takes to train the same skill in Habitat 2.0. Second, we use Galactic to perform the largest-scale experiment to date for rearrangement using 5B steps of experience in 46 hours, which is equivalent to 20 years of robot experience. This scaling results in a single neural network composed of task-agnostic components achieving 85% success in GeometricGoal rearrangement, compared to 0% success reported in Habitat 2.0 for the same approach. The code is available at github.com/facebookresearch/galactic.

galactic, machine learning, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

2306.07552

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)
Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (0.67)

Add feedback

Habitat-Matterport 3D Dataset (HM3D): 1000 Large-scale 3D Environments for Embodied AI

Ramakrishnan, Santhosh K., Gokaslan, Aaron, Wijmans, Erik, Maksymets, Oleksandr, Clegg, Alex, Turner, John, Undersander, Eric, Galuba, Wojciech, Westbury, Andrew, Chang, Angel X., Savva, Manolis, Zhao, Yili, Batra, Dhruv

arXiv.org Artificial IntelligenceSep-16-2021

We present the Habitat-Matterport 3D (HM3D) dataset. HM3D is a large-scale dataset of 1,000 building-scale 3D reconstructions from a diverse set of real-world locations. Each scene in the dataset consists of a textured 3D mesh reconstruction of interiors such as multi-floor residences, stores, and other private indoor spaces. HM3D surpasses existing datasets available for academic research in terms of physical scale, completeness of the reconstruction, and visual fidelity. HM3D contains 112.5k m^2 of navigable space, which is 1.4 - 3.7x larger than other building-scale datasets such as MP3D and Gibson. When compared to existing photorealistic 3D datasets such as Replica, MP3D, Gibson, and ScanNet, images rendered from HM3D have 20 - 85% higher visual fidelity w.r.t. counterpart images captured with real cameras, and HM3D meshes have 34 - 91% fewer artifacts due to incomplete surface reconstruction. The increased scale, fidelity, and diversity of HM3D directly impacts the performance of embodied AI agents trained using it. In fact, we find that HM3D is `pareto optimal' in the following sense -- agents trained to perform PointGoal navigation on HM3D achieve the highest performance regardless of whether they are evaluated on HM3D, Gibson, or MP3D. No similar claim can be made about training on other datasets. HM3D-trained PointNav agents achieve 100% performance on Gibson-test dataset, suggesting that it might be time to retire that episode dataset.

artificial intelligence, dataset, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2109.08238

Country: North America > United States (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.89)

Add feedback

Block-Sparse Recurrent Neural Networks

Narang, Sharan, Undersander, Eric, Diamos, Gregory

arXiv.org Machine LearningNov-7-2017

Recurrent Neural Networks (RNNs) are used in state-of-the-art models in domains such as speech recognition, machine translation, and language modelling. Sparsity is a technique to reduce compute and memory requirements of deep learning models. Sparse RNNs are easier to deploy on devices and high-end server processors. Even though sparse operations need less compute and memory relative to their dense counterparts, the speed-up observed by using sparse operations is less than expected on different hardware platforms. In order to address this issue, we investigate two different approaches to induce block sparsity in RNNs: pruning blocks of weights in a layer and using group lasso regularization to create blocks of weights with zeros. Using these techniques, we demonstrate that we can create block-sparse RNNs with sparsity ranging from 80% to 90% with small loss in accuracy. This allows us to reduce the model size by roughly 10x. Additionally, we can prune a larger dense network to recover this loss in accuracy while maintaining high block sparsity and reducing the overall parameter count. Our technique works with a variety of block sizes up to 32x32. Block-sparse RNNs eliminate overheads related to data storage and irregular memory accesses while increasing hardware efficiency compared to unstructured sparsity.

deep learning, neural network, sparsity, (18 more...)

arXiv.org Machine Learning

1711.02782

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.84)

Industry: Energy > Oil & Gas (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback