 assembly process


IKEA Manuals at Work: 4D Grounding of Assembly Instructions on Internet Videos

Neural Information Processing Systems

While significant progress has been made in developing autonomous agents for shape assembly, existing datasets have not yet tackled the 4D grounding of assembly instructions in videos, essential for a holistic understanding of assembly in 3D space over time.




A robust and compliant robotic assembly control strategy for batch precision assembly task with uncertain fit types and fit amounts

arXiv.org Artificial Intelligence

In some high-precision industrial applications, robots are deployed to perform precision assembly tasks on mass batches of manufactured pegs and holes. If the peg and hole are designed with a transition fit, machining errors may lead to either a clearance or an interference fit for a specific pair of components, with uncertain fit amounts. This paper focuses on the robotic batch precision assembly task involving components with uncertain fit types and fit amounts, and proposes an efficient methodology to construct a robust and compliant assembly control strategy. Specifically, the batch precision assembly task is decomposed into multiple deterministic subtasks, and a force-vision fusion controller-driven reinforcement learning method and a multi-task reinforcement learning training method (FVFC-MTRL) are proposed to jointly learn multiple compliance control strategies for these subtasks. Subsequently, a multi-teacher policy distillation approach is designed to integrate the multiple trained strategies into a unified student network, thereby establishing a robust control strategy. Real-world experiments demonstrate that the proposed method successfully constructs a robust control strategy for high-precision assembly tasks with different fit types and fit amounts. With the development of intelligent manufacturing, deploying robots to replace manual operations in peg-in-hole assembly tasks has greatly enhanced production efficiency and product quality [1]. In the batch assembly process of 3C (computers, communication and consumer electronics) products, high-precision assembly tasks are more challenging as the mating components to be assembled are in tight clearance fits (0.01 mm) or even small interference fits [2]. In high-precision industrial scenarios, the fit types of numerous mating components are designed as transition fits, such as mobile phone lenses.
To enhance assembly efficiency, the geometric dimensions of the mating components are typically not individually measured prior to the assembly.
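
As a rough illustration of the multi-teacher policy distillation idea mentioned in the abstract above, the sketch below averages a KL divergence between each teacher's action distribution and a single student's distribution over states drawn from that teacher's subtask. This is a generic textbook form, not the paper's formulation: the discrete softmax action distributions, the function names, and the equal weighting of subtasks are all assumptions made for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple

Policy = Callable[[float], Sequence[float]]  # hypothetical: state -> action logits


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p: Sequence[float], q: Sequence[float], eps: float = 1e-12) -> float:
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def distillation_loss(
    subtasks: Sequence[Tuple[Policy, Sequence[float]]], student: Policy
) -> float:
    """Mean KL(teacher || student) over all states of all subtasks.

    Each entry of `subtasks` pairs one teacher policy with the states
    sampled from its own subtask (an assumed data layout).
    """
    total, count = 0.0, 0
    for teacher, states in subtasks:
        for s in states:
            total += kl_divergence(softmax(teacher(s)), softmax(student(s)))
            count += 1
    return total / count if count else 0.0
```

A student that reproduces every teacher on that teacher's own states drives this loss to zero; in practice the loss would be minimized by gradient descent on the student's parameters, which is omitted here.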


Multi-Robot Assembly of Deformable Linear Objects Using Multi-Modal Perception

arXiv.org Artificial Intelligence

The handling robot on the left picks one DLO from a bin full of DLO instances and hands it to one of the mounting robots on the right. The two mounting robots then collaboratively mount the DLO onto designated fixtures. The DLO's status is monitored by RGB-D cameras, F/T and ViTac sensors throughout the process. Abstract: Industrial assembly of deformable linear objects (DLOs) such as cables offers great potential for many industries. However, DLOs pose several challenges for robot-based automation due to the inherent complexity of deformation and, consequently, the difficulties in anticipating the behavior of DLOs in dynamic situations. Although existing studies have addressed isolated subproblems like shape tracking, grasping, and shape control, there has been limited exploration of integrated workflows that combine these individual processes. To address this gap, we propose an object-centric perception and planning framework to achieve a comprehensive DLO assembly process throughout the industrial value chain. The framework utilizes visual and tactile information to track the DLO's shape as well as contact state across different stages, which facilitates effective planning of robot actions. Our approach encompasses robot-based bin picking of DLOs from cluttered environments, followed by a coordinated handover to two additional robots that mount the DLOs onto designated fixtures. Real-world experiments employing a setup with multiple robots demonstrate the effectiveness of the approach and its relevance to industrial scenarios.


GAIPAT - Dataset on Human Gaze and Actions for Intent Prediction in Assembly Tasks

arXiv.org Artificial Intelligence

The primary objective of the dataset is to provide a better understanding of the coupling between human actions and gaze in a shared working environment with a cobot, with the aim of significantly enhancing the efficiency and safety of human-cobot interactions. More broadly, by linking gaze patterns with physical actions, the dataset offers valuable insights into cognitive processes and attention dynamics in the context of assembly tasks. The proposed dataset contains gaze and action data from approximately 80 participants, recorded during simulated industrial assembly tasks. The tasks were simulated using controlled scenarios in which participants manipulated educational building blocks. Gaze data was collected using two different eye-tracking setups, head-mounted and remote, while participants worked in two positions: sitting and standing.


Kaiwu: A Multimodal Manipulation Dataset and Framework for Robot Learning and Human-Robot Interaction

arXiv.org Artificial Intelligence

Cutting-edge robot learning techniques, including foundation models and imitation learning from humans, all place heavy demands on large-scale, high-quality datasets, which constitute one of the bottlenecks in the field of general intelligent robots. This paper presents the Kaiwu multimodal dataset to address the lack of real-world synchronized multimodal data in sophisticated assembly scenarios, especially data with dynamics information and fine-grained labelling. The dataset first provides an integrated human, environment and robot data collection framework with 20 subjects and 30 interaction objects, resulting in a total of 11,664 instances of integrated actions. For each demonstration, hand motions, operation pressures, sounds of the assembly process, multi-view videos, high-precision motion capture information, eye gaze with first-person videos, and electromyography signals are all recorded. Fine-grained multi-level annotation based on absolute timestamps and semantic segmentation labelling are performed. The Kaiwu dataset aims to facilitate research on robot learning, dexterous manipulation, human intention investigation and human-robot collaboration.


IKEA Manuals at Work: 4D Grounding of Assembly Instructions on Internet Videos

arXiv.org Artificial Intelligence

Shape assembly is a ubiquitous task in daily life, integral for constructing complex 3D structures like IKEA furniture. While significant progress has been made in developing autonomous agents for shape assembly, existing datasets have not yet tackled the 4D grounding of assembly instructions in videos, essential for a holistic understanding of assembly in 3D space over time. We introduce IKEA Video Manuals, a dataset that features 3D models of furniture parts, instructional manuals, assembly videos from the Internet, and most importantly, annotations of dense spatio-temporal alignments between these data modalities. To demonstrate the utility of IKEA Video Manuals, we present five applications essential for shape assembly: assembly plan generation, part-conditioned segmentation, part-conditioned pose estimation, video object segmentation, and furniture assembly based on instructional video manuals. For each application, we provide evaluation metrics and baseline methods. Through experiments on our annotated data, we highlight many challenges in grounding assembly instructions in videos to improve shape assembly, including handling occlusions, varying viewpoints, and extended assembly sequences.


The best robot kits for kids in 2024

Popular Science

Building a robot at home is more than just a fun activity--it's a hands-on way to explore the exciting world of STEM [Science, Technology, Engineering, and Math]. Whether you're searching for a children's toy robot to inspire curiosity or a more advanced robot-building kit for older kids or teens, like our best overall Sillbird STEM 12-in-1 Education Solar Robot Toy, the best robot kits offer options for all ages and skill levels. Robot building kits offer a perfect blend of creativity and learning, teaching essential skills like coding, problem-solving, and engineering through play. From preschool-friendly robot toys to beginner robotics kits for older children, these sets provide a fantastic introduction to the basics of robotics.


D-RMGPT: Robot-assisted collaborative tasks driven by large multimodal models

arXiv.org Artificial Intelligence

Collaborative robots are increasingly popular for assisting humans at work and daily tasks. However, designing and setting up interfaces for human-robot collaboration is challenging, requiring the integration of multiple components, from perception and robot task control to the hardware itself. Frequently, this leads to highly customized solutions that rely on large amounts of costly training data, diverging from the ideal of flexible and general interfaces that empower robots to perceive and adapt to unstructured environments where they can naturally collaborate with humans. To overcome these challenges, this paper presents the Detection-Robot Management GPT (D-RMGPT), a robot-assisted assembly planner based on Large Multimodal Models (LMM). This system can assist inexperienced operators in assembly tasks without requiring any markers or previous training. D-RMGPT is composed of DetGPT-V and R-ManGPT. DetGPT-V, based on GPT-4V(vision), perceives the surrounding environment through one-shot analysis of prompted images of the current assembly stage and the list of components to be assembled. It identifies which components have already been assembled by analysing their features and assembly requirements. R-ManGPT, based on GPT-4, plans the next component to be assembled and generates the robot's discrete actions to deliver it to the human co-worker. Experimental tests on assembling a toy aircraft demonstrated that D-RMGPT is flexible and intuitive to use, achieving an assembly success rate of 83% while reducing the assembly time for inexperienced operators by 33% compared to the manual process. http://robotics-and-ai.github.io/LMMmodels/
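
The two-stage loop described in the D-RMGPT abstract (perceive which components are already assembled, then plan and deliver the next one) can be sketched roughly as follows. This is only a structural illustration under stated assumptions: the `detect` and `deliver` callables are hypothetical stand-ins for the perception stage (DetGPT-V) and the robot delivery action, no multimodal model is called, and the fixed assembly order and component names are invented for the example.

```python
from typing import Callable, List, Optional, Sequence, Set


def next_component(order: Sequence[str], assembled: Set[str]) -> Optional[str]:
    """Return the first component in the assumed assembly order not yet assembled."""
    for name in order:
        if name not in assembled:
            return name
    return None


def run_assistant(
    order: Sequence[str],
    detect: Callable[[], Set[str]],   # hypothetical perception stage
    deliver: Callable[[str], None],   # hypothetical robot delivery action
    max_steps: int = 50,
) -> List[str]:
    """Alternate between perception and planning until nothing is left to deliver."""
    delivered: List[str] = []
    for _ in range(max_steps):
        assembled = detect()                      # what is already on the assembly?
        target = next_component(order, assembled)  # what should come next?
        if target is None:
            break
        deliver(target)                           # hand the part to the co-worker
        delivered.append(target)
    return delivered
```

For a quick dry run, `detect` can read from a plain set and `deliver` can add to it, which simulates an operator who mounts each part as soon as it arrives. The `max_steps` bound keeps the loop from running forever if perception never reports progress.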