Simultaneous Pick and Place Detection by Combining SE(3) Diffusion Models with Differential Kinematics
Ko, Tianyi; Ikeda, Takuya; Opra, Balazs; Nishiwaki, Koichi
arXiv.org Artificial Intelligence
Grasp detection methods typically aim to detect a set of free-floating hand poses that can grasp the object. However, not all of the detected grasp poses are executable, owing to physical constraints. Although it is straightforward to filter out invalid grasp poses in post-processing, such a two-stage approach is computationally inefficient, especially when the constraint is hard. In this work, we propose an approach that takes the following two constraints into account during the grasp detection stage: (i) the picked object must be placeable in a predefined configuration without in-hand manipulation, and (ii) both the pick and the place poses must be reachable by the robot under joint-limit and collision-avoidance constraints. Our key idea is to train an SE(3) grasp diffusion network to estimate the noise in the form of a spatial velocity, and to constrain the denoising process with multi-target differential inverse kinematics subject to an inequality constraint, so that the states are guaranteed to be reachable and placement can be performed without collision. In addition to an improved success ratio, we experimentally confirmed that our approach is more efficient and more consistent in computation time than a naive two-stage approach.
Pick-and-place is one of the most fundamental applications of robots. Despite the large number of works on generating "pick" poses, few consider picking and placing simultaneously. A single robot arm with a simple hand often leaves no margin for in-hand manipulation or handover.
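To make the idea of constraining an update through differential inverse kinematics more concrete, here is a minimal Python sketch. It is not the authors' method: it is a much simplified, single-target stand-in that uses a planar arm, damped least squares, and plain clipping to joint limits in place of the paper's multi-target SE(3) formulation with an inequality constraint. The function names (`planar_jacobian`, `constrained_step`), the link lengths, the damping value, and the step size are all assumptions made for illustration. The vector `v_desired` plays the role of a velocity-like update direction, such as one a denoising network might output.

```python
import numpy as np


def planar_jacobian(q, link_lengths):
    """Jacobian of the (x, y) tip position of a planar serial arm.

    q holds relative joint angles; link_lengths holds the link lengths.
    """
    absolute = np.cumsum(q)  # absolute orientation of each link
    n = len(q)
    J = np.zeros((2, n))
    for i in range(n):
        # Joint i rotates every link from index i onward.
        J[0, i] = -np.sum(link_lengths[i:] * np.sin(absolute[i:]))
        J[1, i] = np.sum(link_lengths[i:] * np.cos(absolute[i:]))
    return J


def constrained_step(q, v_desired, link_lengths, q_min, q_max,
                     dt=0.1, damping=1e-2):
    """One differential-IK step toward v_desired, kept within joint limits.

    Assumption: clipping is a crude substitute for a proper
    inequality-constrained solve (e.g. a QP); it only illustrates the idea
    that every intermediate state stays feasible.
    """
    J = planar_jacobian(q, link_lengths)
    # Damped least squares: dq = J^T (J J^T + lambda^2 I)^{-1} v
    JJt = J @ J.T + (damping ** 2) * np.eye(2)
    dq = J.T @ np.linalg.solve(JJt, v_desired)
    return np.clip(q + dt * dq, q_min, q_max)


if __name__ == "__main__":
    lengths = np.array([1.0, 1.0, 0.5])
    q = np.array([0.3, 0.4, -0.2])
    lower, upper = np.full(3, -1.0), np.full(3, 1.0)
    for _ in range(50):
        q = constrained_step(q, np.array([-0.5, 0.5]), lengths, lower, upper)
    print("joint angles after 50 steps:", q)
```

Because every step is mapped into joint space and then limited, each intermediate configuration remains within the joint range by construction, rather than being checked and discarded afterwards. That is the general motivation the abstract gives for avoiding a two-stage filter.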
Aug-6-2025
- Country:
- Asia > Japan
- Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- North America > United States
- Massachusetts > Middlesex County > Cambridge (0.04)
- Genre:
- Research Report (0.40)
- Technology: