AITopics | Saito, Daichi

Saito, Daichi

Modality-Driven Design for Multi-Step Dexterous Manipulation: Insights from Neuroscience

Wake, Naoki, Kanehira, Atsushi, Saito, Daichi, Takamatsu, Jun, Sasabuchi, Kazuhiro, Koike, Hideki, Ikeuchi, Katsushi

arXiv.org Artificial Intelligence, Dec-15-2024

Multi-step dexterous manipulation is a fundamental skill in household scenarios, yet remains an underexplored area in robotics. This paper proposes a modular approach, where each step of the manipulation process is addressed with dedicated policies based on effective modality input, rather than relying on a single end-to-end model. To demonstrate this, a dexterous robotic hand performs a manipulation task involving picking up and rotating a box. Guided by insights from neuroscience, the task is decomposed into three sub-skills: 1) reaching, 2) grasping and lifting, and 3) in-hand rotation, based on the dominant sensory modalities employed in the human brain. Each sub-skill is addressed using distinct methods from a practical perspective: a classical controller, a Vision-Language-Action model, and a reinforcement learning policy with force feedback, respectively. We tested the pipeline on a real robot to demonstrate the feasibility of our approach. The key contribution of this study lies in presenting a neuroscience-inspired, modality-driven methodology for multi-step dexterous manipulation.

artificial intelligence, dexterous manipulation, manipulation, (14 more...)

2412.11337

Country: Asia > Japan (0.47)

Genre:

Research Report (1.00)
Workflow (0.67)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology: Information Technology > Artificial Intelligence > Robots > Manipulation (1.00)

Task Success Prediction for Open-Vocabulary Manipulation Based on Multi-Level Aligned Representations

Goko, Miyu, Kambara, Motonari, Saito, Daichi, Otsuki, Seitaro, Sugiura, Komei

arXiv.org Artificial Intelligence, Oct-1-2024

In this study, we consider the problem of predicting task success for open-vocabulary manipulation by a manipulator, based on instruction sentences and egocentric images before and after manipulation. Conventional approaches, including multimodal large language models (MLLMs), often fail to appropriately understand detailed characteristics of objects and/or subtle changes in the position of objects. We propose Contrastive $\lambda$-Repformer, which predicts task success for table-top manipulation tasks by aligning images with instruction sentences. Our method integrates the following three key types of features into a multi-level aligned representation: features that preserve local image information; features aligned with natural language; and features structured through natural language. This allows the model to focus on important changes by looking at the differences in the representation between two images. We evaluate Contrastive $\lambda$-Repformer on a dataset based on a large-scale standard dataset, the RT-1 dataset, and on a physical robot platform. The results show that our approach outperformed existing approaches including MLLMs. Our best model achieved an improvement of 8.66 points in accuracy compared to the representative MLLM-based model.

large language model, machine learning, natural language, (17 more...)

2410.00436

Country: Europe > Germany (0.14)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)

APriCoT: Action Primitives based on Contact-state Transition for In-Hand Tool Manipulation

Saito, Daichi, Kanehira, Atsushi, Sasabuchi, Kazuhiro, Wake, Naoki, Takamatsu, Jun, Koike, Hideki, Ikeuchi, Katsushi

arXiv.org Artificial Intelligence, Jul-16-2024

In-hand tool manipulation is an operation that not only manipulates a tool within the hand (i.e., in-hand manipulation) but also achieves a grasp suitable for a task after the manipulation. This study aims to achieve an in-hand tool manipulation skill through deep reinforcement learning. The difficulty of learning the skill arises because this manipulation requires (A) exploring long-term contact-state changes to achieve the desired grasp and (B) highly varied motions depending on the contact-state transition. (A) leads to a sparse reward for a successful grasp, and (B) requires an RL agent to explore widely within the state-action space to learn highly varied actions, leading to sample inefficiency. To address these issues, this study proposes Action Primitives based on Contact-state Transition (APriCoT). APriCoT decomposes the manipulation into short-term action primitives by describing the operation as a contact-state transition based on three action representations (detach, crossover, attach). In each action primitive, fingers are required to perform short-term and similar actions. By training a policy for each primitive, we can mitigate the issues from (A) and (B). This study focuses on a fundamental operation as an example of in-hand tool manipulation: rotating an elongated object grasped with a precision grasp by half a turn to achieve the initial grasp. Experimental results demonstrated that our method succeeded in both the rotation and the achievement of the desired grasp, unlike existing studies. Additionally, it was found that the policy was robust to changes in object shape.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

2407.11436

Country: Asia > Japan (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Designing Library of Skill-Agents for Hardware-Level Reusability

Takamatsu, Jun, Saito, Daichi, Ikeuchi, Katsushi, Kanehira, Atsushi, Sasabuchi, Kazuhiro, Wake, Naoki

arXiv.org Artificial Intelligence, Mar-20-2024

To use new robot hardware in a new environment, it is necessary to develop a control program tailored to that specific robot in that environment. Considering the reusability of software among robots is crucial to minimize the effort involved in this process and maximize software reuse across different robots in different environments. This paper proposes a method to remedy this process by considering hardware-level reusability, using the Learning-from-observation (LfO) paradigm with a pre-designed skill-agent library. The LfO framework represents the required actions in hardware-independent representations, referred to as task models, from observing human demonstrations, capturing the necessary parameters for the interaction between the environment and the robot. When executing the desired actions from the task models, a set of skill agents is employed to convert the representations into robot commands. This paper focuses on the latter part of the LfO framework, utilizing the set to generate robot actions from the task models, and explores a hardware-independent design approach for these skill agents. These skill agents are described in a hardware-independent manner, considering the relative relationship between the robot's hand position and the environment. As a result, it is possible to execute these actions on robots with different hardware configurations by simply swapping the inverse kinematics solver. This paper first defines a necessary and sufficient skill-agent set that covers all possible actions, and considers the design principles for these skill agents in the library. We provide concrete examples of such skill agents and demonstrate the practicality of using them by showing that the same representations can be executed on two different robots, Nextage and Fetch, using the proposed skill-agent set.
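The idea of swapping only the inverse kinematics solver can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the planar two-link arms, the link lengths, and the names `make_two_link_ik`, `forward_kinematics`, and `approach_skill` are hypothetical, and the two toy arms do not model Nextage or Fetch.

```python
import math
from typing import Callable, Tuple

Position = Tuple[float, float]
Joints = Tuple[float, float]
IKSolver = Callable[[Position], Joints]


def make_two_link_ik(l1: float, l2: float) -> IKSolver:
    """Return an analytic IK solver for a planar two-link arm (hypothetical robot)."""
    def solve(target: Position) -> Joints:
        x, y = target
        cos_q2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
        if not -1.0 <= cos_q2 <= 1.0:
            raise ValueError("target is out of reach for this arm")
        q2 = math.acos(cos_q2)
        q1 = math.atan2(y, x) - math.atan2(l2 * math.sin(q2), l1 + l2 * math.cos(q2))
        return q1, q2
    return solve


def forward_kinematics(l1: float, l2: float, joints: Joints) -> Position:
    """Hand position reached by the given joint angles (used only for checking)."""
    q1, q2 = joints
    return (l1 * math.cos(q1) + l2 * math.cos(q1 + q2),
            l1 * math.sin(q1) + l2 * math.sin(q1 + q2))


def approach_skill(object_pos: Position, offset: Position, ik: IKSolver) -> Joints:
    """Hardware-independent skill: the hand target is defined relative to the
    environment (object position plus offset); only `ik` is robot-specific."""
    target = (object_pos[0] + offset[0], object_pos[1] + offset[1])
    return ik(target)


# The same skill description runs on two differently sized toy arms.
object_pos, offset = (0.45, 0.30), (0.05, 0.0)
joints_a = approach_skill(object_pos, offset, make_two_link_ik(0.4, 0.4))
joints_b = approach_skill(object_pos, offset, make_two_link_ik(0.6, 0.3))
hand_a = forward_kinematics(0.4, 0.4, joints_a)
hand_b = forward_kinematics(0.6, 0.3, joints_b)
```

The joint angles differ between the two arms, but both hands end at the same position relative to the object, which is the property the skill description relies on.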

artificial intelligence, penalty, transition, (14 more...)

2403.02316

Country: Asia > Japan (0.28)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Robots > Manipulation (0.67)

Constraint-aware Policy for Compliant Manipulation

Saito, Daichi, Sasabuchi, Kazuhiro, Wake, Naoki, Kanehira, Atsushi, Takamatsu, Jun, Koike, Hideki, Ikeuchi, Katsushi

arXiv.org Artificial Intelligence, Nov-18-2023

Robot manipulation in a physically constrained environment requires compliant manipulation. Compliant manipulation is a manipulation skill to adjust hand motion based on the force imposed by the environment. Recently, reinforcement learning (RL) has been applied to solve household operations involving compliant manipulation. However, previous RL methods have primarily focused on designing a policy for a specific operation, which limits their applicability and requires separate training for every new operation. We propose a constraint-aware policy that is applicable to various unseen manipulations by grouping several manipulations together based on the type of physical constraint involved. The type of physical constraint determines the characteristic of the imposed force direction; thus, a generalized policy is trained in the environment and reward designed on the basis of this characteristic. This paper focuses on two types of physical constraints: prismatic and revolute joints. Experiments demonstrated that the same policy could successfully execute various compliant-manipulation operations, both in simulation and in reality. We believe this study is the first step toward realizing a generalized household robot.

artificial intelligence, machine learning, manipulation, (17 more...)

2311.11007

Country: Asia > Japan > Honshū (0.14)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Robots > Manipulation (0.69)

Text-driven object affordance for guiding grasp-type recognition in multimodal robot teaching

Wake, Naoki, Saito, Daichi, Sasabuchi, Kazuhiro, Koike, Hideki, Ikeuchi, Katsushi

arXiv.org Artificial Intelligence, May-12-2023

This study investigates how text-driven object affordance, which provides prior knowledge about grasp types for each object, affects image-based grasp-type recognition in robot teaching. The researchers created labeled datasets of first-person hand images to examine the impact of object affordance on recognition performance. They evaluated scenarios with real and illusory objects, considering mixed reality teaching conditions where visual object information may be limited. The results demonstrate that object affordance improves image-based recognition by filtering out unlikely grasp types and emphasizing likely ones. The effectiveness of object affordance was more pronounced when there was a stronger bias towards specific grasp types for each object. These findings highlight the significance of object affordance in multimodal robot teaching, regardless of whether real objects are present in the images. Sample code is available at https://github.com/microsoft/arr-grasp-type-recognition.
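One simple way to picture how an affordance prior can filter out unlikely grasp types and emphasize likely ones is to multiply the image-based probabilities by the prior and renormalize. The sketch below assumes this product rule for illustration only; the abstract does not state the fusion actually used, and the function name `fuse_with_affordance`, the grasp-type labels, and all numbers are hypothetical.

```python
from typing import Dict


def fuse_with_affordance(image_probs: Dict[str, float],
                         affordance_prior: Dict[str, float]) -> Dict[str, float]:
    """Reweight image-based grasp-type probabilities by an object-specific prior."""
    fused = {grasp: p * affordance_prior.get(grasp, 0.0)
             for grasp, p in image_probs.items()}
    total = sum(fused.values())
    if total == 0.0:
        # The prior rules out every candidate: fall back to the image-only result.
        return dict(image_probs)
    return {grasp: value / total for grasp, value in fused.items()}


# Hypothetical recognizer output for one first-person hand image.
image_probs = {"power": 0.40, "precision": 0.35, "lateral": 0.25}
# Hypothetical prior for an object name such as "mug" (assumed values).
mug_prior = {"power": 0.70, "precision": 0.30, "lateral": 0.00}

fused = fuse_with_affordance(image_probs, mug_prior)
best_grasp = max(fused, key=fused.get)
```

A grasp type with zero prior is removed from consideration, and the more strongly the prior favors one grasp type, the more it shifts the fused result, which matches the abstract's observation that the effect grows with the bias of the prior.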

affordance, artificial intelligence, machine learning, (16 more...)

doi: 10.1007/s00138-023-01408-z

2103.00268

Country: Asia > Japan (0.28)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Robots > Manipulation (0.47)

Tracker: Model-based Reinforcement Learning for Tracking Control of Human Finger Attached with Thin McKibben Muscles

Saito, Daichi, Nagatomo, Eri, Pardomuan, Jefferson, Koike, Hideki

arXiv.org Artificial Intelligence, Apr-1-2023

To adopt a soft hand exoskeleton to support activities of daily living, it is necessary to control finger joints precisely with the exoskeleton. The problem of controlling joints to follow a given trajectory is called the tracking control problem. In this study, we focus on the tracking control problem of a human finger attached with thin McKibben muscles. To achieve precise control with thin McKibben muscles, there are two problems: one is the complex characteristics of the muscles, for example, non-linearity, hysteresis, and uncertainties in the real world, and the other is the difficulty in accessing a precise model of the muscles and human fingers. To solve these problems, we adopted DreamerV2, which is a model-based reinforcement learning method, but the target trajectory cannot be generated by the learned model. Therefore, we propose Tracker, which is an extension of DreamerV2 for the tracking control problem. In the experiment, we showed that Tracker can achieve an approximately 81% smaller error than PID for the control of a two-link manipulator that imitates a part of a human index finger from the metacarpal bone to the proximal bone. Tracker achieved the control of the third joint of the human index finger with a small error by being trained for approximately 60 minutes. In addition, after the thin McKibben muscles were taken off and attached again, fine-tuning the policy pre-trained on the user's finger took approximately 15 minutes, which is less than the time required for the first training, to achieve almost the same accuracy as before they were taken off.
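The tracking control problem and its error measure can be made concrete with a toy example. The sketch below is not Tracker and not the paper's experimental setup: it assumes a one-joint plant whose velocity equals the command, a sinusoidal target, and a PI controller (a PID without the derivative term) with hand-picked gains; the function name `simulate_tracking` is hypothetical.

```python
import math


def simulate_tracking(kp: float, ki: float, steps: int = 1000, dt: float = 0.01) -> float:
    """Track sin(t) with a PI controller and return the RMS tracking error."""
    q = 0.0          # joint angle of the toy plant
    integral = 0.0   # accumulated error for the integral term
    squared_error = 0.0
    for k in range(steps):
        target = math.sin(k * dt)
        error = target - q
        integral += error * dt
        command = kp * error + ki * integral
        q += command * dt  # toy plant: joint velocity equals the command
        squared_error += error ** 2
    return math.sqrt(squared_error / steps)


rmse_pi = simulate_tracking(kp=20.0, ki=5.0)   # controller active
rmse_none = simulate_tracking(kp=0.0, ki=0.0)  # no control: joint stays at 0
print(f"RMS error with PI control: {rmse_pi:.3f}, without control: {rmse_none:.3f}")
```

A relative statement such as "approximately 81% smaller error than PID" compares an error of this kind between two controllers on the same target trajectory.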

machine learning, reinforcement learning, trajectory, (17 more...)

2304.00227

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Task-sequencing Simulator: Integrated Machine Learning to Execution Simulation for Robot Manipulation

Sasabuchi, Kazuhiro, Saito, Daichi, Kanehira, Atsushi, Wake, Naoki, Takamatsu, Jun, Ikeuchi, Katsushi

arXiv.org Artificial Intelligence, Jan-3-2023

A task-sequencing simulator for robot manipulation that integrates simulation-for-learning and simulation-for-execution is introduced. Unlike existing machine-learning simulation, where a non-decomposed simulation is used to simulate a training scenario, the task-sequencing simulator runs a composed simulation using building blocks. This way, the simulation-for-learning is structured similarly to a multi-step simulation-for-execution. To compose both learning and execution scenarios, a unified trainable-and-composable description of blocks called a concept model is proposed and used. Using the simulator design and concept models, a reusable simulator for learning different tasks, a common-ground system for learning-to-execution, and simulation-to-real are achieved and shown.

artificial intelligence, integrated machine learning, task-sequencing simulator, (2 more...)

2301.01382

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Robots > Manipulation (0.40)
