 teleoperator


Inside China's robotics revolution

The Guardian

An engineer at the AgiBot factory in Shanghai, China, where the 5,000th mass-produced humanoid robot had rolled off the production line.

How close are we to the sci-fi vision of autonomous humanoid robots?

Chen Liang, the founder of Guchi Robotics, an automation company headquartered in Shanghai, is a tall, heavy-set man in his mid-40s with square-rimmed glasses. His everyday manner is calm and understated, but when he is in his element - up close with the technology he builds, or in business meetings discussing the imminent replacement of human workers by robots - he wears an exuberant smile that brings to mind an intern on his first day at his dream job. Guchi makes the machines that install wheels, dashboards and windows for many of the top Chinese car brands, including BYD and Nio. He took the name from the Chinese word, "steadfast intelligence", though the fact that it sounded like an Italian luxury brand was not entirely unwelcome.

For the better part of two decades, Chen has tried to solve what, to him, is an engineering problem: how to eliminate - or, in his view, liberate - as many workers in car factories as technologically possible. Late last year, I visited him at Guchi headquarters on the western outskirts of Shanghai. Next to the head office are several warehouses where Guchi's engineers tinker with robots to fit the specifications of their customers.

Chen, an engineer by training, founded Guchi in 2019 with the aim of tackling the hardest automation task in the car factory: "final assembly", the last leg of production, when all the composite pieces - the dashboard, windows, wheels and seat cushions - come together. At present, his robots can mount wheels, dashboards and windows on to a car without any human intervention, but 80% of the final assembly, he estimates, has yet to be automated. That is what Chen has set his sights on.

As in much of the world, AI has become part of everyday life in China. But what most excites Chinese politicians and industrialists are the strides being made in the field of robotics, which, when combined with advances in AI, could revolutionise the world of work.


Digital twin and extended reality for teleoperation of the electric vehicle battery disassembly

Kaarlela, Tero, Salo, Sami, Outeiro, Jose

arXiv.org Artificial Intelligence

Disassembling and sorting Electric Vehicle Batteries (EVBs) supports a sustainable transition to electric vehicles by enabling a closed-loop supply chain. Currently, the manual disassembly process exposes workers to hazards, including electrocution and toxic chemicals. We propose a teleoperated system for the safe disassembly and sorting of EVBs. A human-in-the-loop can create and save disassembly sequences for unknown EVB types, enabling future automation. An RGB camera aligns the physical and digital twins of the EVB, and the digital twin of the robot is based on the Robot Operating System (ROS) middleware. This hybrid approach combines teleoperation and automation to improve safety, adaptability, and efficiency in EVB disassembly and sorting. The economic contribution is realized by reducing labor dependency and increasing throughput in battery recycling. An online pilot study was set up to evaluate the usability of the presented approach, and the results demonstrate the potential as a user-friendly solution.


Real-time Photorealistic Mapping for Situational Awareness in Robot Teleoperation

Page, Ian, Susbielle, Pierre, Aycard, Olivier, Wieber, Pierre-Brice

arXiv.org Artificial Intelligence

Achieving efficient remote teleoperation is particularly challenging in unknown environments, as the teleoperator must rapidly build an understanding of the site's layout. Online 3D mapping is a proven strategy to tackle this challenge, as it enables the teleoperator to progressively explore the site from multiple perspectives. However, traditional online map-based teleoperation systems struggle to generate visually accurate 3D maps in real time due to the high computational cost involved, leading to poor teleoperation performance. In this work, we propose a solution to improve teleoperation efficiency in unknown environments. Our approach is a novel, modular and efficient GPU-based integration of recent advances in Gaussian splatting SLAM with existing online map-based teleoperation systems. We compare the proposed solution against state-of-the-art teleoperation systems and validate its performance through real-world experiments using an aerial vehicle. The results show significant improvements in decision-making speed and more accurate interaction with the environment, leading to greater teleoperation efficiency. In doing so, our system enhances remote teleoperation by seamlessly integrating photorealistic map generation with real-time performance, enabling effective teleoperation in unfamiliar environments.


Instrumentation for Better Demonstrations: A Case Study

Proesmans, Remko, Lips, Thomas, wyffels, Francis

arXiv.org Artificial Intelligence

Learning from demonstrations is a powerful paradigm for robot manipulation, but its effectiveness hinges on both the quantity and quality of the collected data. In this work, we present a case study of how instrumentation, i.e. integration of sensors, can improve the quality of demonstrations and automate data collection. We instrument a squeeze bottle with a pressure sensor to learn a liquid dispensing task, enabling automated data collection via a PI controller. Transformer-based policies trained on automated demonstrations outperform those trained on human data in 78% of cases. Our findings indicate that instrumentation not only facilitates scalable data collection but also leads to better-performing policies, highlighting its potential in the pursuit of generalist robotic agents.
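The abstract mentions automated data collection via a PI controller acting on a pressure reading. As a rough illustration of that control idea only, here is a minimal Python sketch. The gains, the first-order "pressure" plant, its time constant, and the names `PIController` and `simulate` are all assumptions made for this example; none of them come from the paper.

```python
from dataclasses import dataclass


@dataclass
class PIController:
    """Proportional-integral controller (illustrative gains, not from the paper)."""
    kp: float
    ki: float
    integral: float = 0.0

    def step(self, setpoint: float, measurement: float, dt: float) -> float:
        # Accumulate the error over time, then combine P and I terms.
        error = setpoint - measurement
        self.integral += error * dt
        return self.kp * error + self.ki * self.integral


def simulate(target: float = 1.0, steps: int = 1000, dt: float = 0.01) -> float:
    """Drive a toy first-order plant toward a target pressure.

    The plant (pressure relaxing toward the command with time constant tau)
    is a hypothetical stand-in for a squeeze bottle, chosen only so the
    loop can be run end to end.
    """
    ctrl = PIController(kp=2.0, ki=5.0)
    pressure = 0.0
    tau = 0.1
    for _ in range(steps):
        command = ctrl.step(target, pressure, dt)
        pressure += dt * (command - pressure) / tau
    return pressure


if __name__ == "__main__":
    print(f"final pressure: {simulate():.4f}")
```

The integral term is what removes the steady-state offset: a proportional-only controller on this toy plant would settle below the target.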


Generalizing Safety Beyond Collision-Avoidance via Latent-Space Reachability Analysis

Nakamura, Kensuke, Peters, Lasse, Bajcsy, Andrea

arXiv.org Artificial Intelligence

Hamilton-Jacobi (HJ) reachability is a rigorous mathematical framework that enables robots to simultaneously detect unsafe states and generate actions that prevent future failures. While in theory, HJ reachability can synthesize safe controllers for nonlinear systems and nonconvex constraints, in practice, it has been limited to hand-engineered collision-avoidance constraints modeled via low-dimensional state-space representations and first-principles dynamics. In this work, our goal is to generalize safe robot controllers to prevent failures that are hard -- if not impossible -- to write down by hand, but can be intuitively identified from high-dimensional observations: for example, spilling the contents of a bag. We propose Latent Safety Filters, a latent-space generalization of HJ reachability that tractably operates directly on raw observation data (e.g., RGB images) by performing safety analysis in the latent embedding space of a generative world model. This transforms nuanced constraint specification to a classification problem in latent space and enables reasoning about dynamical consequences that are hard to simulate. In simulation and hardware experiments, we use Latent Safety Filters to safeguard arbitrary policies (from generative policies to direct teleoperation) from complex safety hazards, like preventing a Franka Research 3 manipulator from spilling the contents of a bag or toppling cluttered objects.
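To make the idea of a safety filter concrete, the following Python sketch shows a generic least-restrictive filter: the nominal action is kept while a safety value function predicts the next state is safe, and otherwise it is replaced by the candidate action with the highest predicted safety value. This is a toy under stated assumptions, not the authors' implementation. The one-dimensional state, the hand-written `toy_value` and `toy_dynamics`, the threshold, and the candidate action set are all hypothetical; in the paper the analysis is performed in the latent space of a generative world model.

```python
from typing import Callable, Sequence


def safety_filter(
    state: float,
    nominal_action: float,
    candidates: Sequence[float],
    dynamics: Callable[[float, float], float],
    value: Callable[[float], float],
    threshold: float = 0.0,
) -> float:
    """Return the nominal action if it keeps the value above threshold,
    otherwise the candidate action that maximizes the next-state value."""
    if value(dynamics(state, nominal_action)) > threshold:
        return nominal_action
    return max(candidates, key=lambda a: value(dynamics(state, a)))


def toy_dynamics(x: float, a: float) -> float:
    # Hypothetical single-integrator step with a 0.1 time step.
    return x + 0.1 * a


def toy_value(x: float) -> float:
    # Hypothetical safety value: positive inside |x| < 1, negative outside.
    return 1.0 - abs(x)


if __name__ == "__main__":
    actions = [-1.0, 0.0, 1.0]
    # Far from the boundary, the nominal action passes through unchanged.
    print(safety_filter(0.0, 1.0, actions, toy_dynamics, toy_value))
    # Near the boundary, the filter overrides the nominal action.
    print(safety_filter(0.95, 1.0, actions, toy_dynamics, toy_value))
```

The filter is agnostic to where the nominal action comes from, which matches the abstract's point that arbitrary policies, including direct teleoperation, can be safeguarded.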


Gaze-based Task Decomposition for Robot Manipulation in Imitation Learning

Takizawa, Ryo, Ohmura, Yoshiyuki, Kuniyoshi, Yasuo

arXiv.org Artificial Intelligence

In imitation learning for robotic manipulation, decomposing object manipulation tasks into multiple sub-tasks is essential. This decomposition enables the reuse of learned skills in varying contexts and the combination of acquired skills to perform novel tasks, rather than merely replicating demonstrated motions. Gaze plays a critical role in human object manipulation, where it is strongly correlated with hand movements. We hypothesize that an imitating agent's gaze control, fixating on specific landmarks and transitioning between them, simultaneously segments demonstrated manipulations into sub-tasks. In this study, we propose a simple yet robust task decomposition method based on gaze transitions. The method leverages teleoperation, a common modality in robotic manipulation for collecting demonstrations, in which a human operator's gaze is measured and used for task decomposition as a substitute for an imitating agent's gaze. Notably, our method achieves consistent task decomposition across all demonstrations for each task, which is desirable in contexts such as machine learning. We applied this method to demonstrations of various tasks and evaluated the characteristics and consistency of the resulting sub-tasks. Furthermore, through extensive testing across a wide range of hyperparameter variations, we demonstrated that the proposed method possesses the robustness necessary for application to different robotic systems.
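As a rough illustration of segmenting a demonstration at gaze transitions, the Python sketch below splits a sequence of per-frame gaze landmark labels into sub-task segments wherever the fixated landmark changes. It is a simplified stand-in, not the method from the paper: the assumption that gaze has already been mapped to discrete landmark labels, the `min_dwell` debouncing rule, and the function name `segment_by_gaze` are all hypothetical.

```python
from itertools import groupby
from typing import Hashable, List, Sequence, Tuple

Segment = Tuple[Hashable, int, int]  # (landmark label, start frame, end frame exclusive)


def segment_by_gaze(labels: Sequence[Hashable], min_dwell: int = 3) -> List[Segment]:
    """Split frames into segments at gaze transitions.

    Runs shorter than min_dwell frames are treated as glances and absorbed
    into the preceding segment (assumed heuristic, not from the paper).
    """
    segments: List[Segment] = []
    start = 0
    for label, run in groupby(labels):
        end = start + sum(1 for _ in run)
        too_short = (end - start) < min_dwell
        same_as_prev = bool(segments) and segments[-1][0] == label
        if segments and (too_short or same_as_prev):
            # Extend the previous segment instead of opening a new one.
            prev_label, prev_start, _ = segments[-1]
            segments[-1] = (prev_label, prev_start, end)
        else:
            segments.append((label, start, end))
        start = end
    return segments


if __name__ == "__main__":
    gaze = ["cup"] * 5 + ["bowl"] * 1 + ["cup"] * 4 + ["bowl"] * 6
    print(segment_by_gaze(gaze))
    # → [('cup', 0, 10), ('bowl', 10, 16)]
```

In this example the single-frame glance at "bowl" is absorbed, so the demonstration splits into two sub-tasks rather than four.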


Dual-Arm Telerobotic Platform for Robotic Hotbox Operations for Nuclear Waste Disposition in EM Sites

Lee, Joong-Ku, Park, Young Soo

arXiv.org Artificial Intelligence

This paper introduces a dual-arm telerobotic platform designed to efficiently and safely execute hot cell operations for nuclear waste disposition at EM sites. The proposed system consists of a remote robot arm platform and a teleoperator station, both integrated with a software architecture to control the entire system. The dual-arm configuration of the remote platform enhances versatility and task performance in complex and hazardous environments, ensuring precise manipulation and effective handling of nuclear waste materials. The integration of a teleoperator station enables a human teleoperator to remotely control the entire system in real time, enhancing decision-making capabilities, situational awareness, and dexterity. The control software plays a crucial role in our system, providing a robust and intuitive interface for the teleoperator. Test operation results demonstrate the system's effectiveness in operating as a remote hotbox for nuclear waste disposition, showcasing its potential applicability in real EM sites.


Learning to Look Around: Enhancing Teleoperation and Learning with a Human-like Actuated Neck

Sen, Bipasha, Wang, Michelle, Thakur, Nandini, Agarwal, Aditya, Agrawal, Pulkit

arXiv.org Artificial Intelligence

We introduce a teleoperation system that integrates a 5 DOF actuated neck, designed to replicate natural human head movements and perception. By enabling behaviors like peeking or tilting, the system provides operators with a more intuitive and comprehensive view of the environment, improving task performance, reducing cognitive load, and facilitating complex whole-body manipulation. We demonstrate the benefits of natural perception across seven challenging teleoperation tasks, showing how the actuated neck enhances the scope and efficiency of remote operation. Furthermore, we investigate its role in training autonomous policies through imitation learning. In three distinct tasks, the actuated neck supports better spatial awareness, reduces distribution shift, and enables adaptive task-specific adjustments compared to a static wide-angle camera.


Wheeled Humanoid Bilateral Teleoperation with Position-Force Control Modes for Dynamic Loco-Manipulation

Purushottam, Amartya, Yan, Jack, Xu, Christopher, Sim, Youngwoo, Ramos, Joao

arXiv.org Artificial Intelligence

Remote-controlled humanoid robots can revolutionize manufacturing, construction, and healthcare industries by performing complex or dangerous manual tasks traditionally done by humans. We refer to these behaviors as Dynamic Loco-Manipulation (DLM). To successfully complete these tasks, humans control the position of their bodies and contact forces at their hands. To enable similar whole-body control in humanoids, we introduce loco-manipulation retargeting strategies with switched position and force control modes in a bilateral teleoperation framework. Our proposed locomotion mappings use the pitch and yaw of the operator's torso to control robot position or acceleration. The manipulation retargeting maps the operator's arm movements to the robot's arms for joint-position or impedance control of the end-effector. A Human-Machine Interface captures the teleoperator's motion and provides haptic feedback to their torso, enhancing their awareness of the robot's interactions with the environment. In this paper, we demonstrate two forms of DLM. First, we show the robot slotting heavy boxes (5-10.5 kg), weighing up to 83% of the robot's weight, into desired positions. Second, we show human-robot collaboration for carrying an object, where the robot and teleoperator take on leader and follower roles.


Vision Language Model-Empowered Contract Theory for AIGC Task Allocation in Teleoperation

Zhan, Zijun, Dong, Yaxian, Hu, Yuqing, Li, Shuai, Cao, Shaohua, Han, Zhu

arXiv.org Artificial Intelligence

Integrating low-light image enhancement techniques, in which diffusion-based AI-generated content (AIGC) models are promising, is necessary to enhance nighttime teleoperation. Remarkably, the AIGC model is computation-intensive, thus necessitating the allocation of AIGC tasks to edge servers with ample computational resources. Given the distinct cost of the AIGC model trained with varying-sized datasets and AIGC tasks possessing disparate demand, it is imperative to formulate a differential pricing strategy to optimize the utility of teleoperators and edge servers concurrently. Nonetheless, the pricing strategy formulation is under information asymmetry, i.e., the demand (e.g., the difficulty level of AIGC tasks and their distribution) of AIGC tasks is hidden information to edge servers. Additionally, manually assessing the difficulty level of AIGC tasks is tedious and unnecessary for teleoperators. To this end, we devise a framework of AIGC task allocation assisted by the Vision Language Model (VLM)-empowered contract theory, which includes two components: VLM-empowered difficulty assessment and contract theory-assisted AIGC task allocation. The first component enables automatic and accurate AIGC task difficulty assessment. The second component is capable of formulating the pricing strategy for edge servers under information asymmetry, thereby optimizing the utility of both edge servers and teleoperators. The simulation results demonstrated that our proposed framework can improve the average utility of teleoperators and edge servers by 10.88~12.43% and 1.4~2.17%, respectively. Code and data are available at https://github.com/ZiJun0819/VLM-Contract-Theory.