ooi
Increasing the Task Flexibility of Heavy-Duty Manipulators Using Visual 6D Pose Estimation of Objects
Mäkinen, Petri, Mustalahti, Pauli, Kivelä, Tuomo, Mattila, Jouni
Recent advances in visual 6D pose estimation of objects using deep neural networks have enabled novel ways of vision-based control for heavy-duty robotic applications. In this study, we present a pipeline for the precise tool positioning of heavy-duty, long-reach (HDLR) manipulators using advanced machine vision. A camera is utilized in the so-called eye-in-hand configuration to estimate directly the poses of a tool and a target object of interest (OOI). Based on the pose error between the tool and the target, along with motion-based calibration between the camera and the robot, precise tool positioning can be reliably achieved using conventional robotic modeling and control methods prevalent in the industry. The proposed methodology comprises orientation and position alignment based on the visually estimated OOI poses, whereas camera-to-robot calibration is conducted based on motion utilizing visual SLAM. The methods seek to avert the inaccuracies resulting from rigid-body--based kinematics of structurally flexible HDLR manipulators via image-based algorithms. To train deep neural networks for OOI pose estimation, only synthetic data are utilized. The methods are validated in a real-world setting using an HDLR manipulator with a 5 m reach. The experimental results demonstrate that an image-based average tool positioning error of less than 2 mm along the non-depth axes is achieved, which facilitates a new way to increase the task flexibility and automation level of non-rigid HDLR manipulators.
- North America > United States (0.28)
- Europe > Finland (0.14)
- Europe > Netherlands (0.14)
IPPON: Common Sense Guided Informative Path Planning for Object Goal Navigation
Qu, Kaixian, Tan, Jie, Zhang, Tingnan, Xia, Fei, Cadena, Cesar, Hutter, Marco
Navigating efficiently to an object in an unexplored environment is a critical skill for general-purpose intelligent robots. Recent approaches to this object goal navigation problem have embraced a modular strategy, integrating classical exploration algorithms-notably frontier exploration-with a learned semantic mapping/exploration module. This paper introduces a novel informative path planning and 3D object probability mapping approach. The mapping module computes the probability of the object of interest through semantic segmentation and a Bayes filter. Additionally, it stores probabilities for common objects, which semantically guides the exploration based on common sense priors from a large language model. The planner terminates when the current viewpoint captures enough voxels identified with high confidence as the object of interest. Although our planner follows a zero-shot approach, it achieves state-of-the-art performance as measured by the Success weighted by Path Length (SPL) and Soft SPL in the Habitat ObjectNav Challenge 2023, outperforming other works by more than 20%. Furthermore, we validate its effectiveness on real robots. Project webpage: https://ippon-paper.github.io/
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > United States (0.14)
- Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)
UIVNAV: Underwater Information-driven Vision-based Navigation via Imitation Learning
Lin, Xiaomin, Karapetyan, Nare, Joshi, Kaustubh, Liu, Tianchen, Chopra, Nikhil, Yu, Miao, Tokekar, Pratap, Aloimonos, Yiannis
Autonomous navigation in the underwater environment is challenging due to limited visibility, dynamic changes, and the lack of a cost-efficient accurate localization system. We introduce UIVNav, a novel end-to-end underwater navigation solution designed to drive robots over Objects of Interest (OOI) while avoiding obstacles, without relying on localization. UIVNav uses imitation learning and is inspired by the navigation strategies used by human divers who do not rely on localization. UIVNav consists of the following phases: (1) generating an intermediate representation (IR), and (2) training the navigation policy based on human-labeled IR. By training the navigation policy on IR instead of raw data, the second phase is domain-invariant -- the navigation policy does not need to be retrained if the domain or the OOI changes. We show this by deploying the same navigation policy for surveying two different OOIs, oyster and rock reefs, in two different domains, simulation, and a real pool. We compared our method with complete coverage and random walk methods which showed that our method is more efficient in gathering information for OOIs while also avoiding obstacles. The results show that UIVNav chooses to visit the areas with larger area sizes of oysters or rocks with no prior information about the environment or localization. Moreover, a robot using UIVNav compared to complete coverage method surveys on average 36% more oysters when traveling the same distances. We also demonstrate the feasibility of real-time deployment of UIVNavin pool experiments with BlueROV underwater robot for surveying a bed of oyster shells.
- North America > United States > Maryland > Prince George's County > College Park (0.14)
- South America > Uruguay > Maldonado > Maldonado (0.04)
- North America > United States > California > Los Angeles County > Torrance (0.04)
- (3 more...)
Planning for Complex Non-prehensile Manipulation Among Movable Objects by Interleaving Multi-Agent Pathfinding and Physics-Based Simulation
Saxena, Dhruv Mauria, Likhachev, Maxim
Real-world manipulation problems in heavy clutter require robots to reason about potential contacts with objects in the environment. We focus on pick-and-place style tasks to retrieve a target object from a shelf where some `movable' objects must be rearranged in order to solve the task. In particular, our motivation is to allow the robot to reason over and consider non-prehensile rearrangement actions that lead to complex robot-object and object-object interactions where multiple objects might be moved by the robot simultaneously, and objects might tilt, lean on each other, or topple. To support this, we query a physics-based simulator to forward simulate these interaction dynamics which makes action evaluation during planning computationally very expensive. To make the planner tractable, we establish a connection between the domain of Manipulation Among Movable Objects and Multi-Agent Pathfinding that lets us decompose the problem into two phases our M4M algorithm iterates over. First we solve a multi-agent planning problem that reasons about the configurations of movable objects but does not forward simulate a physics model. Next, an arm motion planning problem is solved that uses a physics-based simulator but does not search over possible configurations of movable objects. We run simulated and real-world experiments with the PR2 robot and compare against relevant baseline algorithms. Our results highlight that M4M generates complex 3D interactions, and solves at least twice as many problems as the baselines with competitive performance.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
- North America > United States > Massachusetts > Middlesex County > Natick (0.04)
- North America > Mexico > Guanajuato (0.04)
- (4 more...)
Planning for Manipulation among Movable Objects: Deciding Which Objects Go Where, in What Order, and How
Saxena, Dhruv, Likhachev, Maxim
We are interested in pick-and-place style robot manipulation tasks in cluttered and confined 3D workspaces among movable objects that may be rearranged by the robot and may slide, tilt, lean or topple. A recently proposed algorithm, M4M, determines which objects need to be moved and where by solving a Multi-Agent Pathfinding MAPF abstraction of this problem. It then utilises a nonprehensile push planner to compute actions for how the robot might realise these rearrangements and a rigid body physics simulator to check whether the actions satisfy physics constraints encoded in the problem. However, M4M greedily commits to valid pushes found during planning, and does not reason about orderings over pushes if multiple objects need to be rearranged. Furthermore, M4M does not reason about other possible MAPF solutions that lead to different rearrangements and pushes. In this paper, we extend M4M and present Enhanced-M4M (E-M4M) -- a systematic graph search-based solver that searches over orderings of pushes for movable objects that need to be rearranged and different possible rearrangements of the scene. We introduce several algorithmic optimisations to circumvent the increased computational complexity, discuss the space of problems solvable by E-M4M and show that experimentally, both on the real robot and in simulation, it significantly outperforms the original M4M algorithm, as well as other state-of-the-art alternatives when dealing with complex scenes.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
- North America > Canada > Quebec > Montreal (0.04)
- North America > United States > Massachusetts > Middlesex County > Natick (0.04)
- (3 more...)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
The Metaverse Data Deluge: What Can We Do About It?
Ooi, Beng Chin, Chen, Gang, Shou, Mike Zheng, Tan, Kian-Lee, Tung, Anthony, Xiao, Xiaokui, Yip, James Wei Luen, Zhang, Meihui
In the Metaverse, the physical space and the virtual space co-exist, and interact simultaneously. While the physical space is virtually enhanced with information, the virtual space is continuously refreshed with real-time, real-world information. To allow users to process and manipulate information seamlessly between the real and digital spaces, novel technologies must be developed. These include smart interfaces, new augmented realities, efficient storage and data management and dissemination techniques. In this paper, we first discuss some promising co-space applications. These applications offer opportunities that neither of the spaces can realize on its own. We then discuss challenges. Finally, we discuss and envision what are likely to be required from the database and system perspectives.
- Research Report (1.00)
- Overview > Innovation (0.48)
- Leisure & Entertainment > Games > Computer Games (1.00)
- Information Technology > Services (1.00)
- Information Technology > Security & Privacy (1.00)
- (2 more...)
Merging human and machine intelligence - Tech News The Star Online
As technology becomes indispensable from everyday life, humanity and tech will merge to the point they are indistinguishable, predicts communication and marketing firm PHD. Its research into the forces shaping marketing's future produced Merge: The Closing Gap Between Technology And Us, a documentary and book on how technology and human evolution had progressed since the 1950s and how technological advances over the next 25 to 35 years would reshape society and the marketing industry. The research drew on experts' insights and foresights, including inventor and author Ray Kurzweil, Facebook COO Sheryl Sandberg, futurologist Dr Ian Pearson and Microsoft UK chief envisioning officer Dave Coplin. PHD Malaysia head Eileen Ooi said the year ahead would bring a host of new technologies that would challenge the industry but also present new opportunities for those willing to invest in innovation and take a bold first step. She cited four key trends: chatbots, sentient virtual personal assistants, next wave of wearables, and intelligent data layers that give you additional info when viewing things.