Domae, Yukiyasu
Learning Bimanual Manipulation via Action Chunking and Inter-Arm Coordination with Transformers
Motoda, Tomohiro, Hanai, Ryo, Nakajo, Ryoichi, Murooka, Masaki, Erich, Floris, Domae, Yukiyasu
Robots that operate autonomously in human living environments need the ability to handle various tasks flexibly. One crucial element is coordinated bimanual movement, which enables functions that are difficult to perform with one hand alone. In recent years, learning-based models that focus on the possibilities of bimanual movements have been proposed. However, the robot's many degrees of freedom make control difficult to reason about, and the left and right arms must adjust their actions to the situation, which makes dexterous tasks hard to realize. To address this issue, we focus on coordination and efficiency between the arms, particularly for synchronized actions, and propose a novel imitation learning architecture that predicts cooperative actions. We separate the architecture for the two arms and add an intermediate encoder layer, the Inter-Arm Coordinated transformer Encoder (IACE), which facilitates synchronization and temporal alignment to ensure smooth and coordinated actions. To verify the effectiveness of our architecture, we perform distinctive bimanual tasks. The experimental results show that our model achieves a higher success rate than the comparison methods and suggest a suitable architecture for policy learning of bimanual manipulation.
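The abstract describes arm-specific encoding joined by an intermediate coordination encoder that predicts chunks of future actions for both arms. The sketch below is a minimal, hypothetical PyTorch layout of that idea; the layer sizes, chunk length, and the way the two arm streams are fused are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of an action-chunking policy with an inter-arm coordination
# encoder. All dimensions and the fusion scheme are illustrative assumptions.
import torch
import torch.nn as nn


class BimanualChunkPolicy(nn.Module):
    def __init__(self, obs_dim=64, d_model=128, chunk_len=20, action_dim=7):
        super().__init__()
        # Separate encoders so each arm keeps its own feature stream.
        self.left_enc = nn.Linear(obs_dim, d_model)
        self.right_enc = nn.Linear(obs_dim, d_model)
        # Inter-arm coordination: self-attention over the concatenated
        # left/right token sequences (stand-in for the IACE layer).
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.iace = nn.TransformerEncoder(layer, num_layers=2)
        # Per-arm heads that emit a chunk of future actions in one shot.
        self.left_head = nn.Linear(d_model, chunk_len * action_dim)
        self.right_head = nn.Linear(d_model, chunk_len * action_dim)
        self.chunk_len, self.action_dim = chunk_len, action_dim

    def forward(self, left_obs, right_obs):
        # left_obs, right_obs: (batch, seq, obs_dim) per-arm observation features
        l = self.left_enc(left_obs)
        r = self.right_enc(right_obs)
        fused = self.iace(torch.cat([l, r], dim=1))   # joint attention over both arms
        l_ctx = fused[:, : l.shape[1]].mean(1)
        r_ctx = fused[:, l.shape[1]:].mean(1)
        left_chunk = self.left_head(l_ctx).view(-1, self.chunk_len, self.action_dim)
        right_chunk = self.right_head(r_ctx).view(-1, self.chunk_len, self.action_dim)
        return left_chunk, right_chunk


if __name__ == "__main__":
    policy = BimanualChunkPolicy()
    left, right = torch.randn(2, 10, 64), torch.randn(2, 10, 64)
    la, ra = policy(left, right)
    print(la.shape, ra.shape)  # (2, 20, 7) each
```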
SuctionPrompt: Visual-assisted Robotic Picking with a Suction Cup Using Vision-Language Models and Facile Hardware Design
Motoda, Tomohiro, Kitamura, Takahide, Hanai, Ryo, Domae, Yukiyasu
The development of large language models and vision-language models (VLMs) has led to the increasing use of robotic systems in various fields. However, effectively integrating these models into real-world robotic tasks remains a key challenge. We developed a versatile robotic system called SuctionPrompt that combines VLM prompting techniques with 3D detections to perform product-picking tasks in diverse and dynamic environments. Our method highlights the importance of integrating 3D spatial information with adaptive action planning to enable robots to approach and manipulate objects in novel environments. In the validation experiments, the system selected suction points accurately in 75.4% of cases and achieved a 65.0% success rate in picking common items. This study highlights the effectiveness of VLMs in robotic manipulation tasks, even with simple 3D processing.
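As a rough illustration of pairing lightweight 3D processing with VLM prompting for suction-point selection, the sketch below proposes candidate suction points from a depth image (flat, nearby patches), renders them into a text prompt, and parses the model's choice. The candidate heuristic, the prompt wording, and the query_vlm stub are assumptions for illustration only; they are not the SuctionPrompt implementation.

```python
# Illustrative sketch: propose suction candidates from depth, ask a VLM to pick one.
# The VLM call is stubbed out; in practice it would be an API or local model query.
import numpy as np


def suction_candidates(depth, patch=15, top_k=5):
    """Rank pixels whose local depth patch is flat (low variance) and near the camera."""
    h, w = depth.shape
    scores = []
    for v in range(patch, h - patch, patch):
        for u in range(patch, w - patch, patch):
            window = depth[v - patch // 2 : v + patch // 2, u - patch // 2 : u + patch // 2]
            flatness = -np.var(window)          # flatter surface -> higher score
            closeness = -depth[v, u]            # closer to the camera -> higher score
            scores.append((flatness + 0.1 * closeness, (u, v)))
    scores.sort(reverse=True)
    return [pt for _, pt in scores[:top_k]]


def build_prompt(task, candidates):
    lines = [f"Task: pick the {task} with a suction cup.",
             "Candidate suction points (pixel coordinates):"]
    lines += [f"  {i}: {pt}" for i, pt in enumerate(candidates)]
    lines.append("Answer with the index of the best point.")
    return "\n".join(lines)


def query_vlm(prompt, image):
    """Hypothetical stand-in for a vision-language model query."""
    return "0"


if __name__ == "__main__":
    depth = np.random.uniform(0.4, 0.8, (240, 320))
    cands = suction_candidates(depth)
    choice = int(query_vlm(build_prompt("juice box", cands), image=None))
    print("selected suction point:", cands[choice])
```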
Visual Imitation Learning of Non-Prehensile Manipulation Tasks with Dynamics-Supervised Models
Mustafa, Abdullah, Hanai, Ryo, Ramirez, Ixchel, Erich, Floris, Nakajo, Ryoichi, Domae, Yukiyasu, Ogata, Tetsuya
Unlike quasi-static robotic manipulation tasks such as pick-and-place, dynamic tasks such as non-prehensile manipulation pose greater challenges, especially for vision-based control. Successful control requires the extraction of features relevant to the target task. In visual imitation learning settings, these features can be learnt by backpropagating the policy loss through the vision backbone. Yet this approach tends to learn task-specific features with limited generalizability. Alternatively, learning world models can yield more generalizable vision backbones, on top of which task-specific policies are subsequently trained. Commonly, these models are trained solely to predict the next RGB state from the current state and action. However, RGB-only prediction might not fully capture the task-relevant dynamics. In this work, we hypothesize that direct supervision of target dynamic states (Dynamics Mapping) can produce better dynamics-informed world models. Besides next-RGB reconstruction, the world model is also trained to directly predict the position, velocity, and acceleration of the rigid bodies in the environment. To verify our hypothesis, we designed a non-prehensile 2D environment tailored to two tasks: "Balance-Reaching" and "Bin-Dropping". When trained on the first task, dynamics mapping enhanced task performance across different training configurations (Decoupled, Joint, End-to-End) and policy architectures (Feedforward, Recurrent). Notably, its most significant impact was on world model pretraining, boosting the success rate from 21% to 85%. Frozen dynamics-informed world models generalized well to a task with in-domain dynamics, but poorly to one with out-of-domain dynamics.
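One compact way to read the dynamics-mapping idea is a world model trained with two losses: next-RGB reconstruction plus direct regression of rigid-body position, velocity, and acceleration. The sketch below is a hypothetical PyTorch version of that training objective; the encoder/decoder shapes and the loss weighting are assumptions, not the paper's architecture.

```python
# Sketch: world model supervised on next-RGB reconstruction AND rigid-body dynamics.
import torch
import torch.nn as nn


class DynamicsWorldModel(nn.Module):
    def __init__(self, latent=128, action_dim=2, n_bodies=3):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Flatten(), nn.Linear(32 * 16 * 16, latent))
        self.transition = nn.Sequential(
            nn.Linear(latent + action_dim, latent), nn.ReLU(), nn.Linear(latent, latent))
        self.rgb_decoder = nn.Sequential(
            nn.Linear(latent, 32 * 16 * 16), nn.Unflatten(1, (32, 16, 16)),
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 3, 4, stride=2, padding=1))
        # Dynamics mapping head: 2D position, velocity, acceleration per rigid body.
        self.dyn_head = nn.Linear(latent, n_bodies * 6)

    def forward(self, rgb, action):
        z = self.transition(torch.cat([self.encoder(rgb), action], dim=-1))
        return self.rgb_decoder(z), self.dyn_head(z)


if __name__ == "__main__":
    model = DynamicsWorldModel()
    rgb, act = torch.randn(4, 3, 64, 64), torch.randn(4, 2)
    next_rgb_gt = torch.randn(4, 3, 64, 64)
    dyn_gt = torch.randn(4, 18)            # 3 bodies x (pos, vel, acc) in 2D
    pred_rgb, pred_dyn = model(rgb, act)
    loss = nn.functional.mse_loss(pred_rgb, next_rgb_gt) \
         + 0.5 * nn.functional.mse_loss(pred_dyn, dyn_gt)   # assumed loss weighting
    loss.backward()
    print(float(loss))
```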
Motion Priority Optimization Framework towards Automated and Teleoperated Robot Cooperation in Industrial Recovery Scenarios
Itadera, Shunki, Domae, Yukiyasu
In this study, we introduce an optimization framework aimed at enhancing the efficiency of motion priority design in scenarios involving automated and teleoperated robots within an industrial recovery context. The escalating utilization of industrial robots at manufacturing sites has been instrumental in mitigating human workload. Nevertheless, achieving effective human-robot collaboration and cooperation, in which human workers and robots share a workspace for collaborative tasks, remains a challenge. When an industrial robot encounters a failure, the corresponding factory cell must be suspended for safe recovery. Given the limited capacity of pre-programmed robots to rectify such failures, human intervention becomes imperative, requiring a worker to enter the robot workspace to address the failure, such as a dropped object, while the robot system is halted. This interruption of the manufacturing process results in productivity loss. Robotic teleoperation has emerged as a promising technology enabling human workers to undertake high-risk tasks remotely and safely. Our study advocates incorporating robotic teleoperation into the recovery process during manufacturing failure scenarios, which we refer to as "Cooperative Tele-Recovery". Our approach formulates priority rules designed to facilitate collision avoidance between the manufacturing and recovery robots, ensuring a continuous manufacturing process with minimal production loss within a configurable risk limitation. We present a comprehensive motion priority optimization framework, encompassing HRC simulator-based priority optimization and a cooperative multi-robot controller, to identify optimal parameters for the priority function. The framework dynamically adjusts the allocation of motion priorities for the manufacturing and recovery robots while adhering to predefined risk limitations.
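To make the optimization loop concrete, the sketch below shows one hedged reading of it: a parameterized priority function decides whether the manufacturing robot keeps right of way or yields to the recovery robot, and a simulator-in-the-loop search picks the parameters that minimize production loss while staying under a configurable risk limit. The priority form, the simulator stub, and the grid search are illustrative assumptions, not the proposed controller.

```python
# Sketch: simulator-in-the-loop search for motion-priority parameters under a risk limit.
import itertools
import random


def priority(params, dist_to_recovery_robot, task_urgency):
    """Higher value -> the manufacturing robot keeps right of way; otherwise it yields."""
    w_dist, w_urgency = params
    return w_dist * dist_to_recovery_robot + w_urgency * task_urgency


def simulate(params, n_episodes=50, seed=0):
    """Stand-in for an HRC simulator: returns (mean production loss, collision risk)."""
    rng = random.Random(seed)
    losses, risks = [], []
    for _ in range(n_episodes):
        dist, urgency = rng.uniform(0.1, 2.0), rng.uniform(0.0, 1.0)
        keep_going = priority(params, dist, urgency) > 1.0
        # Yielding costs production time; pressing on while close to the other robot adds risk.
        losses.append(0.0 if keep_going else 1.0)
        risks.append(1.0 if keep_going and dist < 0.3 else 0.0)
    return sum(losses) / n_episodes, sum(risks) / n_episodes


def optimize(risk_limit=0.05):
    best = None
    for w_dist, w_urgency in itertools.product([0.5, 1.0, 2.0], [0.0, 0.5, 1.0]):
        loss, risk = simulate((w_dist, w_urgency))
        if risk <= risk_limit and (best is None or loss < best[0]):
            best = (loss, risk, (w_dist, w_urgency))
    return best


if __name__ == "__main__":
    print(optimize())   # (production loss, risk, best priority weights)
```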
NeuralLabeling: A versatile toolset for labeling vision datasets using Neural Radiance Fields
Erich, Floris, Chiba, Naoya, Yoshiyasu, Yusuke, Ando, Noriaki, Hanai, Ryo, Domae, Yukiyasu
Models trained using weakly supervised learning might outperform state-of-the-art (SOTA) models when the SOTA models are not trained on task-specific data, but their performance is lower than that of SOTA models evaluated on data more similar to their training data. Thus there is a need for tools that can support large dataset creation in a time-efficient and low-cost manner. We hope to contribute to solving this problem by introducing a labeling tool for computer vision datasets that uses the power of Neural Radiance Fields (NeRF) [5] for photorealistic rendering and geometric understanding. Because 3D vision can take advantage of 3D consistency, labeled information about a single scene can be applied to images from multiple viewpoints. This property works particularly well with photorealistic renderings such as NeRF, where richly annotated data with many views is available. Specialized labeling tools are essential for labeling vision datasets, and both academic researchers and commercial entities have released such tools. Most existing labeling tools (such as Segment Anything Labeling Tool [6] and Roboflow [7]) use single images and therefore require significant human effort for annotating long sequences, use sequential data but have no geometric understanding so they cannot be used for annotating 6DOF poses [8], or require depth data to obtain geometric information [9, 10, 11, 12]. Our toolkit, NeuralLabeling, operates on sequences of images and can thus be used to more rapidly label large datasets, and by using depth reconstruction with NeRF [5] it does not rely on input depth data.
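The 3D-consistency argument above is what lets one annotation label many frames: a label attached in scene (world) coordinates can be reprojected into every calibrated view. The sketch below shows only that reprojection step with a pinhole camera model in NumPy; the intrinsics, poses, and the labeled point are made-up values, and everything NeRF-specific (training, rendering, occlusion handling) is omitted.

```python
# Sketch: propagate a single 3D annotation to many views via reprojection.
import numpy as np


def project(point_world, K, T_cam_world):
    """Project a 3D point given in the world frame into pixel coordinates of one camera."""
    p_h = np.append(point_world, 1.0)        # homogeneous world point
    p_cam = (T_cam_world @ p_h)[:3]          # world -> camera frame
    uvw = K @ p_cam
    return uvw[:2] / uvw[2]                  # perspective divide -> (u, v)


def toy_pose(angle, radius=1.0, height=0.5):
    """Toy camera pose circling the scene (rotation kept trivially simple for brevity)."""
    T = np.eye(4)
    T[:3, 3] = [-radius * np.cos(angle), -radius * np.sin(angle), height]
    return T


if __name__ == "__main__":
    K = np.array([[500.0, 0.0, 320.0],
                  [0.0, 500.0, 240.0],
                  [0.0, 0.0, 1.0]])
    labeled_point = np.array([0.1, -0.05, 2.0])   # one annotation in the world frame
    for i in range(4):                            # reuse it in every calibrated view
        print(f"view {i}: label lands at pixel {project(labeled_point, K, toy_pose(i * np.pi / 8))}")
```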
Learning to Dexterously Pick or Separate Tangled-Prone Objects for Industrial Bin Picking
Zhang, Xinyi, Domae, Yukiyasu, Wan, Weiwei, Harada, Kensuke
Industrial bin picking for tangled-prone objects requires the robot to either pick up untangled objects or perform separation manipulation when the bin contains no isolated objects. The robot must be able to flexibly perform appropriate actions based on the current observation. This is challenging due to high occlusion in the clutter, elusive entanglement phenomena, and the need for skilled manipulation planning. In this paper, we propose an autonomous, effective, and general approach for picking up tangled-prone objects in industrial bin picking. First, we learn PickNet, a network that maps the visual observation to pixel-wise possibilities of picking isolated objects or separating tangled objects and infers the corresponding grasp. Then, we propose two effective separation strategies: dropping the entangled objects into a buffer bin to reduce the degree of entanglement, and pulling to separate the entangled objects in the buffer bin as planned by PullNet, a network that predicts the position and direction for pulling from visual input.

Bin picking is a valuable task in manufacturing to automate the assembly process. It deploys robots to pick necessary objects from disorganized bins, rather than relying on human workers to arrange the objects or using a large number of part feeders. Existing studies have tackled some challenges in bin picking such as planning grasps under rich clutter. Other studies estimate the pose of each object and evaluate its entanglement level [12], [13]. Such a paradigm relies on full knowledge of the objects and may suffer from cumulative perception errors due to heavy occlusion or self-occlusion of an individual complex-shaped object. Still other studies use force and torque sensors to classify whether the robot has grasped multiple objects [14].
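One hedged way to picture how such pixel-wise outputs drive the robot: compare the peak scores of a "pick" map and a "separate" map, take the winning pixel as the grasp (or pull start) location, and, for separation, read a pull direction from a direction output. The sketch below uses random arrays in place of PickNet/PullNet predictions; the decision rule and output encodings are assumptions, not the papers' exact formulation.

```python
# Sketch: choose between picking and separating from pixel-wise score maps.
import numpy as np


def select_action(pick_map, sep_map, pull_dir_map):
    """pick_map/sep_map: (H, W) scores; pull_dir_map: (H, W, 2) unit pull directions."""
    pick_peak = np.unravel_index(np.argmax(pick_map), pick_map.shape)
    sep_peak = np.unravel_index(np.argmax(sep_map), sep_map.shape)
    if pick_map[pick_peak] >= sep_map[sep_peak]:
        return {"action": "pick", "pixel": pick_peak}
    v, u = sep_peak
    return {"action": "separate", "pixel": sep_peak,
            "pull_direction": pull_dir_map[v, u]}


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    h, w = 120, 160
    pick_map = rng.random((h, w))          # stand-in for PickNet's "pick" channel
    sep_map = rng.random((h, w))           # stand-in for PickNet's "separate" channel
    directions = rng.normal(size=(h, w, 2))
    directions /= np.linalg.norm(directions, axis=-1, keepdims=True)
    print(select_action(pick_map, sep_map, directions))
```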
A Closed-Loop Bin Picking System for Entangled Wire Harnesses using Bimanual and Dynamic Manipulation
Zhang, Xinyi, Domae, Yukiyasu, Wan, Weiwei, Harada, Kensuke
This paper addresses the challenge of industrial bin picking for entangled wire harnesses. Wire harnesses are essential in manufacturing but pose challenges for automation due to their complex geometries and propensity for entanglement. Our previous work tackled this issue by proposing a quasi-static pulling motion to separate the entangled wire harnesses; however, that approach still lacks effectiveness and generalization to various shapes and structures. In this paper, we deploy a dual-arm robot that can grasp, extract, and disentangle wire harnesses from dense clutter using dynamic manipulation. The robot can swing to dynamically discard the entangled objects and regrasp to adjust an undesirable grasp pose. To improve the robustness and accuracy of the system, we leverage a closed-loop framework that uses haptic feedback to detect entanglement in real time and flexibly adjust system parameters. Our bin picking system achieves an overall success rate of 91.2% in real-world experiments using two different types of long wire harnesses, demonstrating its effectiveness in handling various wire harnesses for industrial bin picking.
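As a simple illustration of the closed-loop idea, the sketch below monitors a force/torque signal during lifting and switches to a recovery action (dynamic swing or regrasp) when the measured load stays above a free-lift threshold, which is taken as a sign of entanglement. The sensor readings are simulated and the threshold rule is an assumption; the actual system's detector and parameters are not reproduced here.

```python
# Sketch: haptic-feedback loop that flags entanglement during lifting.
import random


def read_wrench_norm(entangled, rng):
    """Stand-in for an F/T sensor reading: entangled lifts drag extra load."""
    base = 4.0 + rng.gauss(0.0, 0.3)        # object + gripper weight [N]
    return base + (6.0 if entangled else 0.0)


def lift_and_monitor(entangled, free_lift_threshold=7.0, window=5, seed=1):
    rng = random.Random(seed)
    high_readings = 0
    for step in range(20):                   # monitor while the arm lifts
        f = read_wrench_norm(entangled, rng)
        high_readings = high_readings + 1 if f > free_lift_threshold else 0
        if high_readings >= window:          # sustained extra load -> likely entangled
            return "entangled: trigger swing / regrasp"
    return "free: transport to goal"


if __name__ == "__main__":
    print(lift_and_monitor(entangled=False))
    print(lift_and_monitor(entangled=True))
```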
Force Map: Learning to Predict Contact Force Distribution from Vision
Hanai, Ryo, Domae, Yukiyasu, Ramirez-Alpizar, Ixchel G., Leme, Bruno, Ogata, Tetsuya
When humans see a scene, they can roughly imagine the forces applied to objects based on their experience and use them to handle the objects properly. This paper considers transferring this "force-visualization" ability to robots. We hypothesize that a rough force distribution (named "force map") can be utilized for object manipulation strategies even if accurate force estimation is impossible. Based on this hypothesis, we propose a training method to predict the force map from vision. To investigate this hypothesis, we generated scenes where objects were stacked in bulk through simulation and trained a model to predict the contact force from a single image. We further applied domain randomization to make the trained model function on real images. The experimental results showed that the model trained using only synthetic images could predict approximate patterns representing the contact areas of the objects even for real images. Then, we designed a simple algorithm to plan a lifting direction using the predicted force distribution. We confirmed that using the predicted force distribution contributes to finding natural lifting directions for typical real-world scenes. Furthermore, the evaluation through simulations showed that the disturbance caused to surrounding objects was reduced by 26% (translation displacement) and by 39% (angular displacement) for scenes where objects were overlapping.
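To illustrate how a predicted force distribution could inform a lifting direction, the sketch below treats the force map as a weight image and lifts away from the side bearing the most predicted contact force: the in-plane offset from the object's pixel centroid to the force-weighted centroid, flipped and combined with an upward component. This heuristic and all values are assumptions for illustration, not the paper's planning algorithm.

```python
# Sketch: derive a lifting direction that avoids heavily loaded contact regions.
import numpy as np


def lifting_direction(force_map, object_mask, up_weight=1.0):
    """force_map: (H, W) predicted contact force; object_mask: (H, W) bool mask of the target."""
    ys, xs = np.nonzero(object_mask)
    obj_center = np.array([xs.mean(), ys.mean()])
    w = force_map * object_mask
    if w.sum() < 1e-9:                                 # no predicted contact: lift straight up
        return np.array([0.0, 0.0, 1.0])
    ys_w, xs_w = np.nonzero(w)
    force_center = np.average(np.stack([xs_w, ys_w], 1), axis=0, weights=w[ys_w, xs_w])
    away = obj_center - force_center                   # move away from the loaded side (image plane)
    direction = np.array([away[0], away[1], up_weight * np.hypot(*away) + 1e-6])
    return direction / np.linalg.norm(direction)


if __name__ == "__main__":
    force = np.zeros((100, 100))
    force[40:60, 60:80] = 1.0                          # load predicted on the object's right side
    mask = np.zeros((100, 100), dtype=bool)
    mask[30:70, 30:80] = True
    print(lifting_direction(force, mask))              # tilts left and up, away from the load
```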
Learning Efficient Policies for Picking Entangled Wire Harnesses: An Approach to Industrial Bin Picking
Zhang, Xinyi, Domae, Yukiyasu, Wan, Weiwei, Harada, Kensuke
Wire harnesses are essential connecting components in the manufacturing industry but are challenging to automate in industrial tasks such as bin picking. They are long and flexible and tend to get entangled when randomly placed in a bin, which makes it difficult for the robot to grasp a single one in dense clutter. Moreover, training or collecting data in simulation is challenging due to the difficulty of modeling the combination of deformable and rigid components in wire harnesses. In this work, instead of directly lifting wire harnesses, we propose to grasp and extract the target following a circle-like trajectory until it is untangled. We learn a policy from real-world data that can infer grasps and separation actions from visual observation. Our policy enables the robot to efficiently pick and separate entangled wire harnesses by maximizing success rates and reducing execution time. To evaluate our policy, we present a set of real-world experiments on picking wire harnesses. Our policy achieves an overall 84.6% success rate, compared with 49.2% for the baseline. We also evaluate the effectiveness of our policy under different clutter scenarios using unseen types of wire harnesses. The results suggest that our approach is feasible for handling wire harnesses in industrial bin picking.
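The circle-like extraction motion can be pictured as a sequence of end-effector waypoints along an arc that starts at the grasp point and curls up and away from the bin. The sketch below generates such waypoints in NumPy; the arc radius, sweep angle, plane, and number of waypoints are illustrative assumptions rather than the learned policy's actual parameters.

```python
# Sketch: waypoints for a circle-like extraction trajectory starting at the grasp point.
import numpy as np


def circular_extraction(grasp_xyz, radius=0.15, sweep_deg=120.0, n_waypoints=12):
    """Arc in the x-z plane: start at the grasp, then curl upward and away from the bin."""
    grasp = np.asarray(grasp_xyz, dtype=float)
    center = grasp + np.array([0.0, 0.0, radius])      # arc center directly above the grasp
    angles = np.deg2rad(np.linspace(0.0, sweep_deg, n_waypoints))
    # At angle 0 the waypoint coincides with the grasp point, then sweeps along the arc.
    offsets = np.stack([radius * np.sin(angles),
                        np.zeros_like(angles),
                        -radius * np.cos(angles)], axis=1)
    return center + offsets


if __name__ == "__main__":
    waypoints = circular_extraction([0.45, 0.0, 0.10])
    for p in waypoints[:4]:
        print(np.round(p, 3))
```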