Blukis, Valts
RVT: Robotic View Transformer for 3D Object Manipulation
Goyal, Ankit, Xu, Jie, Guo, Yijie, Blukis, Valts, Chao, Yu-Wei, Fox, Dieter
For 3D object manipulation, methods that build an explicit 3D representation perform better than those relying only on camera images. But using explicit 3D representations like voxels comes at a large computational cost, adversely affecting scalability. In this work, we propose RVT, a multi-view transformer for 3D manipulation that is both scalable and accurate. Key features of RVT are an attention mechanism to aggregate information across views and re-rendering of the camera input from virtual views around the robot workspace. In simulations, we find that a single RVT model works well across 18 RLBench tasks with 249 task variations, achieving 26% higher relative success than the existing state-of-the-art method (PerAct). It also trains 36X faster than PerAct to reach the same performance and achieves 2.3X the inference speed of PerAct. Further, RVT can perform a variety of manipulation tasks in the real world with just a few ($\sim$10) demonstrations per task. Visual results, code, and the trained model are provided at https://robotic-view-transformer.github.io/.
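To make the cross-view aggregation idea concrete, here is a minimal sketch (not the released RVT code) of attention over feature tokens from several re-rendered virtual views; the module names, token shapes, and the maximum number of views are illustrative assumptions.

```python
# Illustrative sketch of attention-based aggregation across re-rendered
# virtual views. Assumes each virtual view has already been encoded into a
# sequence of feature tokens; shapes and hyperparameters are placeholders.
import torch
import torch.nn as nn


class MultiViewAggregator(nn.Module):
    """Fuse per-view tokens with self-attention over all views jointly."""

    def __init__(self, dim: int = 256, heads: int = 8, layers: int = 4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=layers)
        self.view_embed = nn.Parameter(torch.zeros(1, 8, 1, dim))  # up to 8 views

    def forward(self, view_tokens: torch.Tensor) -> torch.Tensor:
        # view_tokens: (batch, num_views, tokens_per_view, dim)
        b, v, t, d = view_tokens.shape
        x = view_tokens + self.view_embed[:, :v]  # tag tokens with their view
        x = x.reshape(b, v * t, d)                # attend across all views at once
        return self.encoder(x)                    # (batch, v * t, dim)


if __name__ == "__main__":
    tokens = torch.randn(2, 5, 64, 256)  # e.g. 5 virtual views, 8x8 patches each
    print(MultiViewAggregator()(tokens).shape)  # torch.Size([2, 320, 256])
```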
BundleSDF: Neural 6-DoF Tracking and 3D Reconstruction of Unknown Objects
Wen, Bowen, Tremblay, Jonathan, Blukis, Valts, Tyree, Stephen, Müller, Thomas, Evans, Alex, Fox, Dieter, Kautz, Jan, Birchfield, Stan
We present a near real-time method for 6-DoF tracking of an unknown object from a monocular RGBD video sequence, while simultaneously performing neural 3D reconstruction of the object. Our method works for arbitrary rigid objects, even when visual texture is largely absent. The object is assumed to be segmented in the first frame only. No additional information is required, and no assumption is made about the interaction agent. Key to our method is a Neural Object Field that is learned concurrently with a pose graph optimization process in order to robustly accumulate information into a consistent 3D representation capturing both geometry and appearance. A dynamic pool of posed memory frames is automatically maintained to facilitate communication between these concurrent processes. Our approach handles challenging sequences with large pose changes, partial and full occlusion, untextured surfaces, and specular highlights. We show results on the HO3D, YCBInEOAT, and BEHAVE datasets, demonstrating that our method significantly outperforms existing approaches. Project page: https://bundlesdf.github.io
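As a rough illustration of the "dynamic pool of posed memory frames" mentioned above (not the BundleSDF implementation), the sketch below keeps a new frame only when its estimated pose adds a sufficiently novel viewpoint; the rotation threshold, pool size, and novelty criterion are assumptions.

```python
# Hypothetical memory-frame pool: a frame is stored only if its pose differs
# enough from the poses already kept. Thresholds and the pose metric are
# illustrative, not values from the paper.
import numpy as np


class MemoryFramePool:
    def __init__(self, min_rot_deg: float = 10.0, max_frames: int = 100):
        self.min_rot_deg = min_rot_deg
        self.max_frames = max_frames
        self.frames = []  # list of (rgbd, 4x4 pose)

    @staticmethod
    def _rot_angle_deg(pose_a: np.ndarray, pose_b: np.ndarray) -> float:
        """Geodesic angle between the rotation parts of two 4x4 poses."""
        r = pose_a[:3, :3].T @ pose_b[:3, :3]
        cos = np.clip((np.trace(r) - 1.0) / 2.0, -1.0, 1.0)
        return float(np.degrees(np.arccos(cos)))

    def maybe_add(self, rgbd: np.ndarray, pose: np.ndarray) -> bool:
        """Keep the frame if it contributes a sufficiently novel viewpoint."""
        novel = all(
            self._rot_angle_deg(pose, p) >= self.min_rot_deg for _, p in self.frames
        )
        if novel and len(self.frames) < self.max_frames:
            self.frames.append((rgbd, pose.copy()))
        return novel
```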
A Persistent Spatial Semantic Representation for High-level Natural Language Instruction Execution
Blukis, Valts, Paxton, Chris, Fox, Dieter, Garg, Animesh, Artzi, Yoav
Natural language provides an accessible and expressive interface for specifying long-term tasks for robotic agents. However, non-experts are likely to specify such tasks with high-level instructions, which abstract over specific robot actions through several layers of abstraction. We propose that persistent representations are key to bridging this gap between language and robot actions over long execution horizons. We propose a persistent spatial semantic representation method and show how it enables building an agent that performs hierarchical reasoning to effectively execute long-term tasks. We evaluate our approach on the ALFRED benchmark and achieve state-of-the-art results, despite completely avoiding the commonly used step-by-step instructions.
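A minimal sketch of the general idea of a persistent spatial semantic map, assuming a fixed world-frame grid into which per-point class probabilities are fused so that objects remain queryable after leaving the field of view; the grid size, resolution, and max-fusion rule are illustrative, not the paper's representation.

```python
# Hypothetical persistent semantic map: detections accumulate in a world-frame
# grid across time steps and persist when objects go out of view.
import numpy as np


class PersistentSemanticMap:
    def __init__(self, num_classes: int, grid: int = 64, cell_m: float = 0.25):
        self.cell_m = cell_m
        self.map = np.zeros((grid, grid, num_classes), dtype=np.float32)

    def integrate(self, world_xy: np.ndarray, class_probs: np.ndarray) -> None:
        """Fuse per-point class probabilities (N, C) at world positions (N, 2)."""
        idx = np.clip((world_xy / self.cell_m).astype(int), 0, self.map.shape[0] - 1)
        for (i, j), probs in zip(idx, class_probs):
            # Keep the most confident evidence seen so far for each class.
            self.map[i, j] = np.maximum(self.map[i, j], probs)

    def query(self, class_id: int) -> np.ndarray:
        """Return grid cells where the given class has been observed."""
        return np.argwhere(self.map[:, :, class_id] > 0.5)
```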
Few-shot Object Grounding and Mapping for Natural Language Robot Instruction Following
Blukis, Valts, Knepper, Ross A., Artzi, Yoav
We study the problem of learning a robot policy to follow natural language instructions that can be easily extended to reason about new objects. We introduce a few-shot language-conditioned object grounding method trained from augmented reality data that uses exemplars to identify objects and align them to their mentions in instructions. We present a learned map representation that encodes object locations and their instructed use, and construct it from our few-shot grounding output. We integrate this mapping approach into an instruction-following policy, thereby allowing it to reason about previously unseen objects at test-time by simply adding exemplars. We evaluate on the task of learning to map raw observations and instructions to continuous control of a physical quadcopter. Our approach significantly outperforms the prior state of the art in the presence of new objects, even when the prior approach observes all objects during training.
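The exemplar-based grounding idea above can be illustrated with a simple matching sketch: mentions are aligned to detected regions by similarity against a small, extensible bank of exemplar embeddings. This is a stand-in for the paper's model; the cosine metric and the encoder producing the embeddings are assumptions.

```python
# Illustrative few-shot grounding by exemplar matching. New objects are
# supported at test time simply by adding exemplar embeddings.
import numpy as np


def cosine(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T


class ExemplarGrounder:
    def __init__(self):
        self.exemplars = {}  # object name -> (num_exemplars, dim) embeddings

    def add_exemplars(self, name: str, embeddings: np.ndarray) -> None:
        """Adding exemplars extends the grounder to previously unseen objects."""
        self.exemplars[name] = embeddings

    def ground(self, mention: str, region_embeddings: np.ndarray) -> int:
        """Return the index of the detected region best matching the mention."""
        sims = cosine(region_embeddings, self.exemplars[mention])  # (regions, exemplars)
        return int(sims.max(axis=1).argmax())
```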
Learning to Map Natural Language Instructions to Physical Quadcopter Control using Simulated Flight
Blukis, Valts, Terme, Yannick, Niklasson, Eyvind, Knepper, Ross A., Artzi, Yoav
We propose a joint simulation and real-world learning framework for mapping navigation instructions and raw first-person observations to continuous control. Our model estimates the need for environment exploration, predicts the likelihood of visiting environment positions during execution, and controls the agent to both explore and visit high-likelihood positions. We introduce Supervised Reinforcement Asynchronous Learning (SuReAL). Learning uses both simulation and real environments without requiring autonomous flight in the physical environment during training, and combines supervised learning for predicting positions to visit with reinforcement learning for continuous control. We evaluate our approach on a natural language instruction-following task with a physical quadcopter, and demonstrate effective execution and exploration behavior.
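The two-stage split described above (supervised prediction of positions to visit, feeding a learned controller) can be sketched schematically as follows. This is not the SuReAL training procedure: the grid, the greedy goal selection, and the proportional controller standing in for the learned policy are all illustrative assumptions.

```python
# Schematic two-stage pipeline: a predicted visitation distribution over a
# grid is turned into a continuous velocity command toward the likeliest cell.
import numpy as np


def select_goal(visit_probs: np.ndarray, cell_m: float = 0.5) -> np.ndarray:
    """Pick the world-frame position of the most likely cell to visit next."""
    i, j = np.unravel_index(np.argmax(visit_probs), visit_probs.shape)
    return np.array([i, j], dtype=np.float32) * cell_m


def velocity_command(pose_xy: np.ndarray, goal_xy: np.ndarray,
                     gain: float = 0.8, v_max: float = 1.0) -> np.ndarray:
    """Proportional controller used here as a stand-in for the learned policy."""
    v = gain * (goal_xy - pose_xy)
    speed = np.linalg.norm(v)
    return v if speed <= v_max else v / speed * v_max


if __name__ == "__main__":
    probs = np.random.dirichlet(np.ones(64 * 64)).reshape(64, 64)  # fake prediction
    print(velocity_command(np.zeros(2), select_goal(probs)))
```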
Mapping Navigation Instructions to Continuous Control Actions with Position-Visitation Prediction
Blukis, Valts, Misra, Dipendra, Knepper, Ross A., Artzi, Yoav
Executing natural language navigation instructions from raw observations requires solving language, perception, planning, and control problems. Consider instructing a quadcopter drone using natural language, for example to go towards a blue fence, keeping an anvil and a tree on the right. Resolving the instruction requires identifying the blue fence, anvil, and tree in the world, understanding the spatial constraints towards and on the right, planning a trajectory that satisfies these constraints, and continuously controlling the quadcopter to follow the trajectory. Existing work has addressed this problem mostly using manually-designed symbolic representations for language meaning and the environment [1, 2, 3, 4, 5, 6].
Following High-level Navigation Instructions on a Simulated Quadcopter with Imitation Learning
Blukis, Valts, Brukhim, Nataly, Bennett, Andrew, Knepper, Ross A., Artzi, Yoav
We introduce a method for following high-level navigation instructions by mapping directly from images, instructions, and pose estimates to continuous low-level velocity commands for real-time control. The Grounded Semantic Mapping Network (GSMN) is a fully-differentiable neural network architecture that builds an explicit semantic map in the world reference frame by incorporating a pinhole camera projection model within the network. The information stored in the map is learned from experience, while the local-to-world transformation is computed explicitly. We train the model using DAggerFM, a modified variant of DAgger that trades tabular convergence guarantees for improved training speed and memory use. We test GSMN in virtual environments on a realistic quadcopter simulator and show that incorporating explicit mapping and grounding modules allows GSMN to outperform strong neural baselines and almost reach the performance of an expert policy. Finally, we analyze the learned map representations and show that using an explicit map leads to an interpretable instruction-following model.
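In the spirit of the mapping step described above, here is a minimal sketch of projecting per-pixel image features into a top-down world-frame grid with a pinhole camera model. It is not the GSMN implementation (which performs this projection differentiably inside the network); the intrinsics, depth input, map size, and max-fusion rule are assumed for illustration.

```python
# Back-project per-pixel features through a pinhole camera model and scatter
# them into a world-frame grid map.
import numpy as np


def features_to_world_map(features: np.ndarray, depth: np.ndarray,
                          K: np.ndarray, cam_to_world: np.ndarray,
                          grid: int = 64, cell_m: float = 0.25) -> np.ndarray:
    """Scatter per-pixel features (H, W, C) into a top-down world-frame grid."""
    h, w, c = features.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3)  # (N, 3)
    rays = (np.linalg.inv(K) @ pix.T).T                              # camera-frame rays
    pts_cam = rays * depth.reshape(-1, 1)                            # back-project by depth
    pts_h = np.concatenate([pts_cam, np.ones((pts_cam.shape[0], 1))], axis=1)
    pts_world = (cam_to_world @ pts_h.T).T[:, :3]                    # world-frame points

    world_map = np.zeros((grid, grid, c), dtype=np.float32)
    ij = np.clip((pts_world[:, :2] / cell_m).astype(int), 0, grid - 1)
    for (i, j), f in zip(ij, features.reshape(-1, c)):
        world_map[i, j] = np.maximum(world_map[i, j], f)             # max-pool fusion
    return world_map
```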