
Adapting by Analogy: OOD Generalization of Visuomotor Policies via Functional Correspondence

Gupta, Pranay, Admoni, Henny, Bajcsy, Andrea

arXiv.org Artificial Intelligence

End-to-end visuomotor policies trained using behavior cloning have shown a remarkable ability to generate complex, multi-modal low-level robot behaviors. However, at deployment time, these policies still struggle to act reliably when faced with out-of-distribution (OOD) visuals induced by objects, backgrounds, or environment changes. Prior works in interactive imitation learning solicit corrective expert demonstrations under the OOD conditions -- but this can be costly and inefficient. We observe that task success under OOD conditions does not always warrant novel robot behaviors: in-distribution (ID) behaviors can transfer directly to OOD conditions that share functional similarities with ID conditions. For example, behaviors trained to interact with ID pens can apply to interacting with a visually-OOD pencil. The key challenge lies in disambiguating which ID observations functionally correspond to the OOD observation for the task at hand. We propose that an expert can provide this OOD-to-ID functional correspondence. Thus, instead of collecting new demonstrations and re-training at every OOD encounter, our method: (1) detects the need for feedback by first checking whether current observations are OOD and then identifying whether the most similar training observations show divergent behaviors; (2) solicits functional correspondence feedback to disambiguate between those behaviors; and (3) intervenes on the OOD observations with the functionally corresponding ID observations to perform deployment-time generalization. We validate our method across diverse real-world robotic manipulation tasks with a Franka Panda robotic manipulator. Our results show that test-time functional correspondences can improve the generalization of a vision-based diffusion policy to OOD objects and environment conditions with low feedback cost.
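The three deployment-time steps in the abstract can be sketched in code. This is a minimal illustration, not the authors' implementation: the embedding space, the nearest-neighbor OOD test, the action-divergence test, and the `ask_expert` callback are all assumptions introduced here for clarity.

```python
import numpy as np

def maybe_substitute(obs_emb, id_embs, id_actions, ood_thresh, div_thresh, ask_expert):
    """Sketch of the three deployment-time steps described above.

    obs_emb    : embedding of the current observation
    id_embs    : (N, D) embeddings of training (ID) observations
    id_actions : (N, A) actions associated with each ID observation
    ask_expert : callback taking candidate ID indices and returning the
                 functionally corresponding one (the expert's feedback)
    Returns the observation embedding the policy should act on.
    """
    # (1) OOD check: distance from the current observation to its
    # nearest ID neighbor in embedding space.
    dists = np.linalg.norm(id_embs - obs_emb, axis=1)
    nearest = np.argsort(dists)[:5]
    if dists[nearest[0]] < ood_thresh:
        return obs_emb  # in-distribution: act on the raw observation

    # (2) Divergence check: do the most similar ID observations
    # prescribe conflicting behaviors? If not, no feedback is needed.
    if np.std(id_actions[nearest], axis=0).max() < div_thresh:
        return id_embs[nearest[0]]

    # (3) Solicit functional correspondence feedback and intervene by
    # substituting the corresponding ID observation for the OOD one.
    return id_embs[ask_expert(nearest)]
```

The point of the sketch is the control flow: expert feedback is requested only when the observation is both OOD and ambiguous, which is what keeps the feedback burden low.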


Composable Part-Based Manipulation

Liu, Weiyu, Mao, Jiayuan, Hsu, Joy, Hermans, Tucker, Garg, Animesh, Wu, Jiajun

arXiv.org Artificial Intelligence

Compositionality provides appealing benefits in robotic manipulation, as it enables efficient learning, reasoning, and planning. Prior works have extensively studied the decomposition of scenes into objects and their relationships [1, 2, 3], as well as the division of long-horizon plans into primitive skills [3, 4], in order to navigate complex environments and devise long-horizon plans. In this paper, we present a different view of compositionality by considering object-part decomposition based on functionality (e.g., rim, handle, body), and leverage such decomposition to improve the learning of geometric and physical relationships for robot manipulation. In the context of language descriptions of objects, part names not only describe the geometric shapes of the parts but also capture their functional affordances. For instance, as depicted in Figure 1, for the action of "pouring", the rims define the boundary for alignment between the objects, the body of the pouring vessel should be tilted for the action, and its handle provides a constraint on the direction the object should face when pouring. Leveraging this knowledge of part affordances, we posit that a family of functional actions, such as pouring and constrained placing, can be conceptualized as a combination of functional correspondences between object parts.
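The pouring example above can be made concrete as a data structure: an action expressed as a combination of part-level functional correspondences. The class, the constraint vocabulary, and the helper below are hypothetical illustrations of that idea, not the paper's actual formulation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PartCorrespondence:
    source_part: str   # functional part on the manipulated object
    target_part: str   # functional part on the target object
    constraint: str    # hypothetical constraint label, e.g. "align"

# "Pouring" as a combination of part correspondences, mirroring the
# rim/body/handle description in the text: rims align, the body tilts
# toward the target rim, the handle faces away from it.
POUR = [
    PartCorrespondence("rim", "rim", "align"),
    PartCorrespondence("body", "rim", "tilt-toward"),
    PartCorrespondence("handle", "rim", "face-away"),
]

def parts_used(action):
    """Parts of the manipulated object that the action constrains."""
    return {c.source_part for c in action}
```

Framing actions this way is what makes them composable: a different action such as constrained placing would reuse the same part vocabulary with a different set of correspondences.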