Mobile Manipulation Instruction Generation from Multiple Images with Automatic Metric Enhancement

Kei Katsumata, Motonari Kambara, Daichi Yashima, Ryosuke Korekata, Komei Sugiura

arXiv.org Artificial Intelligence 

Abstract-- We consider the problem of generating free-form mobile manipulation instructions based on a target object image and a receptacle image. Conventional image captioning models are unable to generate appropriate instructions because their architectures are typically optimized for a single image; hence, these methods are inappropriate for generating mobile manipulation instructions based on multiple images, and models are required to handle both images appropriately. In this study, we propose a model that handles both the target object and the receptacle to generate free-form instruction sentences for mobile manipulation tasks. Moreover, we introduce a novel training method that effectively incorporates the scores from both learning-based and n-gram-based automatic evaluation metrics as rewards. This method enables the model to learn the co-occurrence relationships between words and appropriate paraphrases.

Service robots are essential in a variety of contexts, such as elderly care facilities and daily support for people with disabilities. In particular, the integration of service robots in elderly care facilities significantly reduces the burden on caregivers and addresses the growing demand driven by the rise in the elderly population. We propose a model that generates mobile manipulation instructions using a target object image and a receptacle image.
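The abstract describes a training method that mixes a learning-based metric and an n-gram-based metric into a single reward signal. The paper's exact metrics and mixing rule are not specified in this excerpt, so the sketch below is only illustrative: it uses a simple smoothed sentence-level BLEU as the n-gram-based score, a token-overlap F1 as a stand-in for a learned metric (a real system would query a trained scoring model), and a weighted sum with a hypothetical weight `alpha`. The self-critical advantage at the end shows one common way such a reward is used in reinforcement-learning-style caption training.

```python
from collections import Counter
import math

def ngram_precision(hyp, ref, n):
    """Clipped n-gram precision between two token lists."""
    hyp_ngrams = Counter(tuple(hyp[i:i + n]) for i in range(len(hyp) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    if not hyp_ngrams:
        return 0.0
    overlap = sum(min(c, ref_ngrams[g]) for g, c in hyp_ngrams.items())
    return overlap / sum(hyp_ngrams.values())

def bleu(hyp, ref, max_n=4):
    """Smoothed sentence-level BLEU with brevity penalty (the n-gram-based metric)."""
    precisions = []
    for n in range(1, max_n + 1):
        p = ngram_precision(hyp, ref, n)
        precisions.append(p if p > 0 else 1e-9)  # crude smoothing for zero counts
    log_avg = sum(math.log(p) for p in precisions) / max_n
    bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / max(len(hyp), 1))
    return bp * math.exp(log_avg)

def learned_metric(hyp, ref):
    """Placeholder for a learning-based metric: token-overlap F1 stands in
    for a trained quality estimator, which a real pipeline would call here."""
    hyp_set, ref_set = set(hyp), set(ref)
    if not hyp_set or not ref_set:
        return 0.0
    inter = len(hyp_set & ref_set)
    p, r = inter / len(hyp_set), inter / len(ref_set)
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)

def combined_reward(hyp, ref, alpha=0.5):
    """Mix the n-gram and learning-based scores into one scalar reward;
    alpha is a hypothetical weighting, not a value from the paper."""
    return alpha * bleu(hyp, ref) + (1 - alpha) * learned_metric(hyp, ref)

# Self-critical-style advantage: sampled caption's reward minus a greedy baseline's.
ref = "put the bottle on the shelf".split()
sampled = "place the bottle onto the shelf".split()
greedy = "put bottle shelf".split()
advantage = combined_reward(sampled, ref) - combined_reward(greedy, ref)
```

Because the learned component rewards semantic overlap even when surface n-grams differ, a paraphrase such as "place ... onto" can still earn a substantial reward, which is consistent with the abstract's claim that the method helps the model learn appropriate paraphrases.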