manipulathor
ManipulaTHOR: a framework for visual object manipulation
The Allen Institute for AI (AI2) announced the 3.0 release of its embodied artificial intelligence framework AI2-THOR, which adds active object manipulation to its testing framework. ManipulaTHOR is a first of its kind virtual agent with a highly articulated robot arm equipped with three joints of equal limb length and composed entirely of swivel joints to bring a more human-like approach to object manipulation. AI2-THOR is the first testing framework to study the problem of object manipulation in more than 100 visually rich, physics-enabled rooms. By enabling the training and evaluation of generalized capabilities in manipulation models, ManipulaTHOR allows for much faster training in more complex environments as compared to current real-world training methods, while also being far safer and more cost-effective. "Imagine a robot being able to navigate a kitchen, open a refrigerator and pull out a can of soda. This is one of the biggest and yet often overlooked challenges in robotics and AI2-THOR is the first to design a benchmark for the task of moving objects to various locations in virtual rooms, enabling reproducibility and measuring progress," said Dr. Oren Etzioni, CEO at AI2. "After five years of hard work, we can now begin to train robots to perceive and navigate the world more like we do, making real-world usage models more attainable than ever before."
ManipulaTHOR: A Framework for Visual Object Manipulation
Ehsani, Kiana, Han, Winson, Herrasti, Alvaro, VanderBilt, Eli, Weihs, Luca, Kolve, Eric, Kembhavi, Aniruddha, Mottaghi, Roozbeh
The domain of Embodied AI has recently witnessed substantial progress, particularly in navigating agents within their environments. These early successes have laid the building blocks for the community to tackle tasks that require agents to actively interact with objects in their environment. Object manipulation is an established research domain within the robotics community and poses several challenges including manipulator motion, grasping and long-horizon planning, particularly when dealing with oft-overlooked practical setups involving visually rich and complex scenes, manipulation using mobile agents (as opposed to tabletop manipulation), and generalization to unseen environments and objects. We propose a framework for object manipulation built upon the physics-enabled, visually rich AI2-THOR framework and present a new challenge to the Embodied AI community known as ArmPointNav. This task extends the popular point navigation task to object manipulation and offers new challenges including 3D obstacle avoidance, manipulating objects in the presence of occlusion, and multi-object manipulation that necessitates long term planning. Popular learning paradigms that are successful on PointNav challenges show promise, but leave a large room for improvement.