Toward Ownership Understanding of Objects: Active Question Generation with Large Language Model and Probabilistic Generative Model

Hashimoto, Saki, Hasegawa, Shoichi, Ishikawa, Tomochika, Taniguchi, Akira, Hagiwara, Yoshinobu, Hafi, Lotfi El, Taniguchi, Tadahiro

arXiv.org Artificial Intelligence 

Robots operating in daily life environments must understand object ownership to carry out instructions that users give naturally, such as "Bring me my cup." Without ownership knowledge, a robot cannot determine which object is meant when several similar objects exist. This problem is especially evident in kitchens, offices, and laboratories, where objects with similar appearances may belong to different individuals. Relying solely on perceptual features such as location or appearance is insufficient because ownership is inherently context-dependent and often determined by social conventions. Therefore, enabling robots to acquire ownership knowledge is a crucial step toward socially appropriate human-robot interaction.

To enable robots to learn object ownership in daily life environments, it is essential to implement a question-generation mechanism that efficiently acquires the necessary information. However, in real-world environments with large numbers of objects, asking users about every object is impractical and imposes a heavy burden on them. Although robots can explore the environment to collect visual features of objects, it remains difficult to obtain ownership knowledge because it depends on users and context. Therefore, allowing robots to ask questions based on the current situation enables them to acquire ownership knowledge.

Saki Hashimoto is the presenter of this paper.