Point-Goal Navigation


TTA-Nav: Test-time Adaptive Reconstruction for Point-Goal Navigation under Visual Corruptions

Piriyajitakonkij, Maytus, Sun, Mingfei, Zhang, Mengmi, Pan, Wei

arXiv.org Artificial Intelligence

Robot navigation under visual corruption presents a formidable challenge. To address this, we propose a Test-time Adaptation (TTA) method, named TTA-Nav, for point-goal navigation under visual corruptions. Our "plug-and-play" method incorporates a top-down decoder into a pre-trained navigation model. First, the pre-trained navigation model receives a corrupted image and extracts features. Second, the top-down decoder produces a reconstruction from the high-level features extracted by the pre-trained model. The reconstruction of the corrupted image is then fed back to the pre-trained model, which performs a second forward pass to output an action. Despite being trained solely on clean images, the top-down decoder can reconstruct cleaner images from corrupted ones without any gradient-based adaptation. The pre-trained navigation model equipped with our top-down decoder significantly enhances navigation performance across almost all visual corruptions in our benchmarks. Our method improves the success rate of point-goal navigation from the state-of-the-art result of 46% to 94% under the most severe corruption, suggesting its potential for broader application in robotic visual navigation. Project page: https://sites.google.com/view/tta-nav
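The two-pass inference loop the abstract describes can be sketched as follows. This is a minimal illustration of the idea, not the authors' implementation; the `encoder`, `decoder`, and `policy` components here are hypothetical stand-ins for the pre-trained feature extractor, the top-down decoder, and the navigation policy head.

```python
def navigate_step(obs, encoder, decoder, policy):
    """One TTA-Nav-style action step under visual corruption.

    1. Extract high-level features from the (possibly corrupted) image.
    2. Use the top-down decoder (trained only on clean images) to
       reconstruct a cleaner version of the input.
    3. Re-run the pre-trained model on the reconstruction and act.
    Note: no gradients are computed; adaptation is purely feed-forward.
    """
    features = encoder(obs)                   # first forward pass on corrupted input
    reconstruction = decoder(features)        # top-down reconstruction
    clean_features = encoder(reconstruction)  # second forward pass on cleaner input
    return policy(clean_features)
```

Because the decoder only ever sees the model's own high-level features, it acts as a learned denoiser that the navigation policy can reuse unchanged, which is what makes the method "plug-and-play".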


Integrating Egocentric Localization for More Realistic Point-Goal Navigation Agents

Datta, Samyak, Maksymets, Oleksandr, Hoffman, Judy, Lee, Stefan, Batra, Dhruv, Parikh, Devi

arXiv.org Artificial Intelligence

Recent work has presented embodied agents that can navigate to point-goal targets in novel indoor environments with near-perfect accuracy. However, these agents are equipped with idealized sensors for localization and take deterministic actions. This setting is practically sterile by comparison to the dirty reality of noisy sensors and actuations in the real world -- wheels can slip, motion sensors have error, actuations can rebound. In this work, we take a step towards this noisy reality, developing point-goal navigation agents that rely on visual estimates of egomotion under noisy action dynamics. We find these agents outperform naive adaptations of current point-goal agents to this setting as well as those incorporating classic localization baselines. Further, our model conceptually divides learning agent dynamics or odometry (where am I?) from task-specific navigation policy (where do I want to go?). This enables a seamless adaptation to changing dynamics (a different robot or floor type) by simply re-calibrating the visual odometry model -- circumventing the expense of re-training the navigation policy. Our agent was the runner-up in the PointNav track of the CVPR 2020 Habitat Challenge.
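The modular split the abstract describes -- odometry answering "where am I?" and the policy answering "where do I want to go?" -- can be sketched like this. The class and function names are illustrative assumptions, not the paper's code; the point is that only the odometry component is swapped when the robot's dynamics change.

```python
class NavigationAgent:
    """Toy agent separating visual odometry from the navigation policy."""

    def __init__(self, odometry, policy):
        self.odometry = odometry  # egomotion estimator: "where am I?"
        self.policy = policy      # goal-directed policy: "where do I want to go?"
        self.pose = (0.0, 0.0)    # integrated pose estimate

    def step(self, prev_obs, obs, goal):
        # Estimate egomotion visually (no idealized GPS/compass assumed).
        dx, dy = self.odometry(prev_obs, obs)
        self.pose = (self.pose[0] + dx, self.pose[1] + dy)
        return self.policy(self.pose, goal)

    def recalibrate(self, new_odometry):
        # New robot or floor type: replace only the odometry model,
        # keeping the trained navigation policy intact.
        self.odometry = new_odometry
```

The design choice is that a change in actuation dynamics only invalidates the odometry estimate, not the goal-seeking behavior, so `recalibrate` avoids re-training the policy.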


Near-perfect point-goal navigation from 2.5 billion frames of experience

#artificialintelligence

The AI community has a long-term goal of building intelligent machines that interact effectively with the physical world, and a key challenge is teaching these systems to navigate through complex, unfamiliar real-world environments to reach a specified destination -- without a preprovided map. We are announcing today that Facebook AI has created a new large-scale distributed reinforcement learning (RL) algorithm called DD-PPO, which has effectively solved the task of point-goal navigation using only an RGB-D camera, GPS, and compass data. Agents trained with DD-PPO (which stands for decentralized distributed proximal policy optimization) achieve nearly 100 percent success in a variety of virtual environments, such as houses and office buildings. We have also successfully tested our model with tasks in real-world physical settings using a LoCoBot and Facebook AI's PyRobot platform. An unfortunate fact about maps is that they become outdated the moment they are created.


Facebook AI Researchers Achieve a 107x Speedup for Training Virtual Agents – NVIDIA Developer News Center

#artificialintelligence

Navigating a new indoor space without any prior knowledge or even a map is a challenging task for a human, let alone a robot. To help develop intelligent machines that interact more effectively with complex 3D environments, Facebook researchers developed a GPU-accelerated deep reinforcement learning model that achieves near 100 percent success in navigating a variety of virtual environments without a pre-provided map. To achieve this breakthrough, the team focused their work on developing an efficient approach to scaling RL models, which require a significant number of training samples, using multi-node distribution. "A single parameter server and thousands of (typically CPU) workers may be fundamentally incompatible with the needs of modern computer vision and robotics communities," the researchers explained in their post, Near-perfect point-goal navigation from 2.5 billion frames of experience. "Unlike Gym or Atari, 3D simulators require GPU acceleration…. The desired agents operate from high-dimensional inputs (pixels) and use deep networks, such as ResNet50, which strain the parameter server. Thus, existing distributed RL architectures do not scale and there is a need to develop a new distributed architecture."
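The architectural shift the researchers describe -- away from a single parameter server toward decentralized workers -- can be illustrated with a toy gradient-averaging step. This is a hedged sketch of the general decentralized idea (each worker computes gradients locally and they are combined by an all-reduce, so no central node is a bottleneck), not Facebook's actual DD-PPO implementation, which uses GPU-accelerated simulation and collective-communication libraries.

```python
def allreduce_mean(per_worker_grads):
    """Average gradients across workers.

    Stand-in for a collective all-reduce: every worker ends up with the
    same averaged gradient without routing through a parameter server.
    """
    n = len(per_worker_grads)
    dim = len(per_worker_grads[0])
    return [sum(g[i] for g in per_worker_grads) / n for i in range(dim)]

def decentralized_update(params, per_worker_grads, lr=0.1):
    """One synchronous, decentralized SGD-style update."""
    grad = allreduce_mean(per_worker_grads)
    return [p - lr * g for p, g in zip(params, grad)]
```

The key property is symmetry: every worker runs the same code and holds a full copy of the model, which is what lets this pattern scale across nodes where a central server would saturate.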


Facebook AI gives maps the brushoff in helping robots find the way

#artificialintelligence

Facebook has scored an impressive feat involving AI that can navigate without any map. Facebook's wish for bragging rights, although the researchers said they have a way to go, was evident in its blog post, "Near-perfect point-goal navigation from 2.5 billion frames of experience." Long story short, Facebook has delivered an algorithm that, quoting MIT Technology Review, "lets robots find the shortest route in unfamiliar environments, opening the door to robots that can work inside homes and offices." And, in line with the plain-and-simple, Ubergizmo's Tyler Lee also remarked: "Facebook believes that with this new algorithm, it will be capable of creating robots that can navigate an area without the need for maps...in theory, you could place a robot in a room or an area without a map and it should be able to find its way to its destination." Erik Wijmans and Abhishek Kadian, in the Facebook Jan. 21 post, said that one of the technology's key challenges is "teaching these systems to navigate through complex, unfamiliar real-world environments to reach a specified destination--without a preprovided map." Facebook has taken on the challenge. The two announced that Facebook AI created a large-scale distributed reinforcement learning algorithm called DD-PPO, "which has effectively solved the task of point-goal navigation using only an RGB-D camera, GPS, and compass data," they wrote. DD-PPO stands for decentralized distributed proximal policy optimization. This is what Facebook is using to train agents, and the results seen in virtual environments such as houses and office buildings were encouraging. The bloggers pointed out that "even failing 1 out of 100 times is not acceptable in the physical world, where a robot agent might damage itself or its surroundings by making an error." Beyond DD-PPO, the authors gave credit to Facebook AI's open-source AI Habitat platform for its "state-of-the-art speed and fidelity."
AI Habitat was announced as open source last year as a simulation platform to train embodied agents, such as virtual robots, in photo-realistic 3D environments. Facebook said it was part of "Facebook AI's ongoing effort to create systems that are less reliant on large annotated data sets used for supervised training." InfoQ had noted in July that the technology takes a different approach than relying on the static data sets other researchers have traditionally used, and that Facebook decided to open-source it to move the subfield forward. Jon Fingas in Engadget looked at how the team worked toward AI navigation (and this is where that 2.5 billion number comes in): "Previous projects tend to struggle without massive computational power.