apriltag


Localising under the drape: proprioception in the era of distributed surgical robotic systems

Huber, Martin, Cavalcanti, Nicola A., Davoodi, Ayoob, Li, Ruixuan, Mower, Christopher E., Carrillo, Fabio, Laux, Christoph J., Teyssere, Francois, Chandanson, Thibault, Harlé, Antoine, Saghbiny, Elie, Farshad, Mazda, Morel, Guillaume, Poorten, Emmanuel Vander, Fürnstahl, Philipp, Ourselin, Sébastien, Bergeles, Christos, Vercauteren, Tom

arXiv.org Artificial Intelligence

Despite their mechanical sophistication, surgical robots remain blind to their surroundings. This lack of spatial awareness causes collisions, system recoveries, and workflow disruptions, issues that will intensify with the introduction of distributed robots with independent interacting arms. Existing tracking systems rely on bulky infrared cameras and reflective markers, providing only limited views of the surgical scene and adding hardware burden in crowded operating rooms. We present a marker-free proprioception method that enables precise localisation of surgical robots under their sterile draping despite the associated obstruction of visual cues. Our method relies solely on lightweight stereo-RGB cameras and novel transformer-based deep learning models. It builds on the largest multi-centre spatial robotic surgery dataset to date (1.4M self-annotated images from human cadaveric and preclinical in vivo studies). By tracking the entire robot and surgical scene, rather than individual markers, our approach provides a holistic view robust to occlusions, supporting surgical scene understanding and context-aware control. We demonstrate an example of the potential clinical benefits during in vivo breathing compensation, where our method provides access to tissue dynamics that are unobservable under state-of-the-art tracking, and show accurate localisation in multi-robot systems for future intelligent interaction. In addition, and compared with existing systems, our method eliminates markers and improves tracking visibility by 25%. To our knowledge, this is the first demonstration of marker-free proprioception for fully draped surgical robots, reducing setup complexity, enhancing safety, and paving the way toward modular and autonomous robotic surgery.


GPS Denied IBVS-Based Navigation and Collision Avoidance of UAV Using a Low-Cost RGB Camera

Wang, Xiaoyu, Tan, Yan Rui, Leong, William, Huang, Sunan, Teo, Rodney, Xiang, Cheng

arXiv.org Artificial Intelligence

This paper proposes an image-based visual servoing (IBVS) framework for UAV navigation and collision avoidance using only an RGB camera. While UAV navigation has been extensively studied, it remains challenging to apply IBVS in missions involving multiple visual targets and collision avoidance. The proposed method achieves navigation without explicit path planning, and collision avoidance is realized through AI-based monocular depth estimation from RGB images. Unlike approaches that rely on stereo cameras or external workstations, our framework runs fully onboard a Jetson platform, ensuring a self-contained and deployable system. Experimental results validate that the UAV can navigate across multiple AprilTags and avoid obstacles effectively in GPS-denied environments. Most UAV applications depend on position estimation provided by global positioning systems (GPS). However, GPS is often unavailable in indoor, mountainous, or forest environments, motivating the use of computer vision for UAV navigation. This paper focuses on image-based visual servoing (IBVS) with an onboard RGB camera.
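The abstract does not spell out the servoing law; the classic point-feature IBVS formulation it builds on can be sketched as below. This is a minimal illustration with assumed depths, gains, and function names, not code from the paper:

```python
import numpy as np

def interaction_matrix(x, y, Z):
    """Interaction (image Jacobian) matrix for a normalized point feature
    (x, y) at depth Z, relating feature velocity to the camera twist
    (vx, vy, vz, wx, wy, wz)."""
    return np.array([
        [-1 / Z, 0, x / Z, x * y, -(1 + x**2), y],
        [0, -1 / Z, y / Z, 1 + y**2, -x * y, -x],
    ])

def ibvs_velocity(features, desired, depths, gain=0.5):
    """Classic IBVS control law: v = -lambda * L^+ (s - s*)."""
    error = (features - desired).reshape(-1)
    L = np.vstack([interaction_matrix(x, y, Z)
                   for (x, y), Z in zip(features, depths)])
    return -gain * np.linalg.pinv(L) @ error

# One feature offset from its goal: the commanded twist drives the
# image error toward zero.
s = np.array([[0.1, 0.0]])       # current feature (e.g. AprilTag center)
s_star = np.array([[0.0, 0.0]])  # desired feature location
v = ibvs_velocity(s, s_star, depths=[1.0])
```

With a single point feature the law yields a 6-vector camera twist; in practice several features (e.g. the four AprilTag corners) are stacked to constrain all degrees of freedom.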


Measuring and Minimizing Disturbance of Marine Animals to Underwater Vehicles

Cai, Levi, Jézéquel, Youenn, Mooney, T. Aran, Girdhar, Yogesh

arXiv.org Artificial Intelligence

Do fish respond to the presence of underwater vehicles, potentially biasing our estimates about them? If so, are there strategies to measure and mitigate this response? This work provides a theoretical and practical framework towards bias-free estimation of animal behavior from underwater vehicle observations. We also provide preliminary results from the field in coral reef environments to address these questions.


Adjustable AprilTags For Identity Secured Tasks

Li, Hao

arXiv.org Artificial Intelligence

Special tags such as AprilTags that facilitate image processing and pattern recognition are useful in practical applications. In closed and private environments, identity security is unlikely to be an issue because all involved AprilTags can be completely regulated. However, in open and public environments, identity security is no longer an issue that can be neglected. To handle potential harm caused by adversarial attacks, this note advocates the utilization of adjustable AprilTags instead of fixed ones.
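As a purely hypothetical illustration of what an adjustable tag scheme could look like, one option is to derive the displayed tag ID from a shared secret and a time epoch, so an observer who sees the current tag cannot predict or spoof the next one. The scheme, function names, and the use of the tag36h11 dictionary are assumptions for illustration, not taken from the note:

```python
import hashlib
import hmac

def tag_id_for_epoch(secret: bytes, epoch: int, dictionary_size: int = 587) -> int:
    """Derive the AprilTag ID to display during a given time epoch from a
    shared secret via HMAC-SHA256. 587 is the size of the standard
    tag36h11 family (assumed here for concreteness)."""
    digest = hmac.new(secret, epoch.to_bytes(8, "big"), hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big") % dictionary_size

# Parties holding the secret compute the same rotating ID schedule.
key = b"shared-secret"
ids = [tag_id_for_epoch(key, e) for e in range(3)]
```

A verifier with the secret recomputes the expected ID for the current epoch and rejects tags that do not match.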


LLM-Drone: Aerial Additive Manufacturing with Drones Planned Using Large Language Models

Raman, Akshay, Merrill, Chad, George, Abraham, Farimani, Amir Barati

arXiv.org Artificial Intelligence

Additive manufacturing (AM) has transformed the production landscape by enabling the precision creation of complex geometries. However, AM faces limitations when applied to challenging environments, such as elevated surfaces and remote locations. Aerial additive manufacturing, facilitated by drones, presents a solution to these challenges. However, despite advances in methods for the planning, control, and localization of drones, the accuracy of these methods is insufficient to run traditional feedforward extrusion-based additive manufacturing processes (such as fused deposition modeling). Recently, the emergence of LLMs has revolutionized various fields by introducing advanced semantic reasoning and real-time planning capabilities. This paper proposes the integration of LLMs with aerial additive manufacturing to assist with the planning and execution of construction tasks, granting greater flexibility and enabling a feedback-based design and construction system. Using the semantic understanding and adaptability of LLMs, we can overcome the limitations of drone-based systems by dynamically generating and adapting building plans on site, ensuring efficient and accurate construction even in constrained environments. Our system is able to design and build structures given only a semantic prompt and has shown success in understanding the spatial environment despite tight planning constraints. Our method's feedback system enables replanning using the LLM if the manufacturing process encounters unforeseen errors, without requiring complicated heuristics or evaluation functions. Combining semantic planning with automatic error correction, our system achieved a 90% build accuracy when converting simple text prompts into built structures.


Integration of Augmented Reality and Mobile Robot Indoor SLAM for Enhanced Spatial Awareness

Friske, Michael D.

arXiv.org Artificial Intelligence

This research explores the integration of indoor Simultaneous Localization and Mapping (SLAM) with Augmented Reality (AR) to enhance situational awareness, improving safety in hazardous or emergency situations. The main contribution of this work is to enable mobile robots to provide real-time spatial perception to users who are not co-located with the robot. This is a comprehensive approach, including selecting suitable sensors for indoor SLAM, designing and building a platform, developing methods to display maps on AR devices, implementing this into software on an AR device, and improving the robustness of communication and localization between the robot and AR device in real-world testing. By taking this approach and analyzing each component of the integrated system, this paper highlights numerous areas for future research that can further advance the integration of SLAM and AR technologies. These advancements aim to significantly improve safety and efficiency during rescue operations.


STAR: Swarm Technology for Aerial Robotics Research

Chiun, Jimmy, Tan, Yan Rui, Cao, Yuhong, Tan, John, Sartoretti, Guillaume

arXiv.org Artificial Intelligence

In recent years, the field of aerial robotics has witnessed significant progress, finding applications in diverse domains, including post-disaster search and rescue operations. Despite these strides, the prohibitive acquisition costs associated with deploying physical multi-UAV systems have posed challenges, impeding their widespread utilization in research endeavors. To overcome these challenges, we present STAR (Swarm Technology for Aerial Robotics Research), a framework developed explicitly to improve the accessibility of aerial swarm research experiments. Our framework introduces a swarm architecture based on the Crazyflie, a low-cost, open-source, palm-sized aerial platform, well suited for experimental swarm algorithms. To augment cost-effectiveness and mitigate the limitations of employing low-cost robots in experiments, we propose a landmark-based localization module leveraging fiducial markers. This module, also serving as a target detection module, enhances the adaptability and versatility of the framework. Additionally, collision and obstacle avoidance are implemented through velocity obstacles. The presented work strives to bridge the gap between theoretical advances and tangible implementations, thus fostering progress in the field.
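For context, the velocity-obstacle test the framework relies on for collision avoidance can be sketched as a simple 2D cone check: a candidate velocity is unsafe if the relative velocity points into the collision cone subtended by the other agent. This is an illustrative implementation with made-up names, not the STAR code:

```python
import numpy as np

def in_velocity_obstacle(p_a, p_b, v_a, v_b, combined_radius):
    """Return True if agent A's velocity v_a lies inside the velocity
    obstacle induced by agent B, i.e. the relative velocity v_a - v_b
    points into the collision cone around the line from A to B."""
    rel_p = p_b - p_a
    rel_v = v_a - v_b
    dist = np.linalg.norm(rel_p)
    if dist <= combined_radius:
        return True  # already overlapping
    half_angle = np.arcsin(combined_radius / dist)  # cone half-angle
    speed = np.linalg.norm(rel_v)
    if speed == 0:
        return False  # no relative motion, no future collision
    cos_angle = np.clip(rel_p @ rel_v / (dist * speed), -1.0, 1.0)
    return np.arccos(cos_angle) < half_angle

# Head-on approach falls inside the cone; a sideways velocity does not.
p_a, p_b = np.zeros(2), np.array([2.0, 0.0])
head_on = in_velocity_obstacle(p_a, p_b, np.array([1.0, 0.0]), np.zeros(2), 0.3)
sideways = in_velocity_obstacle(p_a, p_b, np.array([0.0, 1.0]), np.zeros(2), 0.3)
```

A planner would sample candidate velocities and keep the one closest to the preferred velocity that passes this test against all neighbors.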


Autonomous Robot for Disaster Mapping and Victim Localization

Potter, Michael, Bhowal, Rahil, Zhao, Richard, Patel, Anuj, Cheng, Jingming

arXiv.org Artificial Intelligence

In response to the critical need for effective reconnaissance in disaster scenarios, this research article presents the design and implementation of a complete autonomous robot system using the Turtlebot3 with Robotic Operating System (ROS) Noetic. Upon deployment in closed, initially unknown environments, the system aims to generate a comprehensive map and identify any present 'victims' using AprilTags as stand-ins. We discuss our solution for search and rescue missions, while additionally exploring more advanced algorithms to improve search and rescue functionalities. We introduce a Cubature Kalman Filter to help reduce the mean squared error [m] for AprilTag localization and an information-theoretic exploration algorithm to expedite exploration in unknown environments. Just like turtles, our system takes it slow and steady, but when it's time to save the day, it moves at ninja-like speed! Despite Donatello's shell, he's no slowpoke - he zips through obstacles with the agility of a teenage mutant ninja turtle. So, hang on tight to your shells and get ready for a whirlwind of reconnaissance! Full pipeline code: https://github.com/rzhao5659/MRProject/tree/main Exploration code: https://github.com/rzhao5659/MRProject/tree/main
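The core of the Cubature Kalman Filter mentioned above is the third-degree spherical-radial cubature rule: 2n equally weighted sigma points at the columns of a matrix square root of the covariance. A minimal sketch of the point generation and time update follows (illustrative only, process noise omitted, not the project's implementation):

```python
import numpy as np

def cubature_points(mean, cov):
    """Generate the 2n cubature points x_i = mean +/- sqrt(n) * S e_i,
    where S is a square root of cov (here via Cholesky)."""
    n = mean.size
    S = np.linalg.cholesky(cov)
    offsets = np.sqrt(n) * S
    return np.concatenate([mean + offsets.T, mean - offsets.T], axis=0)

def propagate(mean, cov, f):
    """CKF time update: push cubature points through f and re-average.
    All 2n points carry equal weight 1/(2n); process noise is omitted."""
    pts = cubature_points(mean, cov)
    fx = np.array([f(p) for p in pts])
    new_mean = fx.mean(axis=0)
    diff = fx - new_mean
    new_cov = diff.T @ diff / pts.shape[0]
    return new_mean, new_cov

# Sanity check: a linear dynamics function recovers the usual Kalman
# prediction A m and A P A^T.
A = np.array([[1.0, 0.1], [0.0, 1.0]])  # constant-velocity step
m = np.array([0.0, 1.0])
P = np.eye(2) * 0.01
m2, P2 = propagate(m, P, lambda x: A @ x)
```

Unlike the EKF, no Jacobians are required; the same `propagate` works unchanged for nonlinear tag-measurement models.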


Design of a Miniature Underwater Vehicle and Data Collection System for Indoor Experimentation

Herbert, Jacob, Wolek, Artur

arXiv.org Artificial Intelligence

This paper describes the design of a miniature uncrewed underwater vehicle (MiniUUV) and related instrumentation for indoor experimentation. The MiniUUV was developed using 3D printed components and low-cost, off-the-shelf electronics. The vehicle uses a propeller differential propulsion drive and a peristaltic pump with a syringe for buoyancy control. A water tank with an overhead camera system was constructed to allow for convenient indoor data collection in a controlled environment. Several tests were conducted to demonstrate the capabilities of the MiniUUV and data collection system, including buoyancy pump actuation tests and straight line, circular, and zig-zag motion tests on the surface. During each planar motion test an AprilTag was attached to the MiniUUV and an overhead camera system obtained video recordings that were processed offline to estimate vehicle position, surge velocity, sway velocity, yaw angle, and yaw rate.
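The offline processing described above can be sketched, assuming the AprilTag detections have already been converted to world-frame positions and yaw angles, as finite differencing followed by rotation of the world-frame velocity into the body frame to obtain surge and sway. Function and variable names are hypothetical, not from the paper:

```python
import numpy as np

def body_velocities(positions, yaws, dt):
    """From world-frame tag positions (N, 2) and yaw angles (N,), sampled
    at interval dt, estimate surge (forward) and sway (lateral) velocities
    and yaw rate via finite differences."""
    v_world = np.diff(positions, axis=0) / dt        # world-frame velocity
    c, s = np.cos(yaws[:-1]), np.sin(yaws[:-1])      # heading at each step
    surge = c * v_world[:, 0] + s * v_world[:, 1]    # along the heading
    sway = -s * v_world[:, 0] + c * v_world[:, 1]    # perpendicular to it
    yaw_rate = np.diff(np.unwrap(yaws)) / dt         # unwrap avoids 2*pi jumps
    return surge, sway, yaw_rate

# A vehicle translating straight along its +y heading: pure surge.
t = np.arange(5) * 0.1
pos = np.stack([np.zeros(5), 0.2 * t], axis=1)  # 0.2 m/s along +y
yaw = np.full(5, np.pi / 2)                      # heading +y
surge, sway, yaw_rate = body_velocities(pos, yaw, dt=0.1)
```

In practice the raw differences would be low-pass filtered, since pixel-level tag jitter is amplified by differentiation.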


CylinderTag: An Accurate and Flexible Marker for Cylinder-Shape Objects Pose Estimation Based on Projective Invariants

Wang, Shaoan, Zhu, Mingzhu, Hu, Yaoqing, Li, Dongyue, Yuan, Fusong, Yu, Junzhi

arXiv.org Artificial Intelligence

High-precision pose estimation based on visual markers has been a thriving research topic in the field of computer vision. However, the suitability of traditional flat markers on curved objects is limited due to the diverse shapes of curved surfaces, which hinders the development of high-precision pose estimation for curved objects. Therefore, this paper proposes a novel visual marker called CylinderTag, which is designed for developable curved surfaces such as cylindrical surfaces. CylinderTag is a cyclic marker that can be firmly attached to objects with a cylindrical shape. Leveraging the manifold assumption, the cross-ratio in projective invariance is utilized for encoding in the direction of zero curvature on the surface. Additionally, to facilitate the usage of CylinderTag, we propose a heuristic search-based marker generator and a high-performance recognizer as well. Moreover, an all-encompassing evaluation of CylinderTag properties is conducted by means of extensive experimentation, covering detection rate, detection speed, dictionary size, localization jitter, and pose estimation accuracy. CylinderTag showcases superior detection performance from varying view angles in comparison to traditional visual markers, accompanied by higher localization accuracy. Furthermore, CylinderTag boasts real-time detection capability and an extensive marker dictionary, offering enhanced versatility and practicality in a wide range of applications. Experimental results demonstrate that the CylinderTag is a highly promising visual marker for use on cylindrical-like surfaces, thus offering important guidance for future research on high-precision visual localization of cylinder-shaped objects. The code is available at: https://github.com/wsakobe/CylinderTag.
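The projective invariant behind CylinderTag's encoding is the cross-ratio of four collinear points, which survives any projective transformation; a short numeric check demonstrates the property (an illustrative sketch, not the paper's code):

```python
import numpy as np

def cross_ratio(a, b, c, d):
    """Cross-ratio (a, b; c, d) of four collinear points given by scalar
    coordinates along the line."""
    return ((a - c) * (b - d)) / ((a - d) * (b - c))

def homography_1d(x, h):
    """Apply a 1D projective map x -> (h00*x + h01) / (h10*x + h11)."""
    return (h[0, 0] * x + h[0, 1]) / (h[1, 0] * x + h[1, 1])

pts = np.array([0.0, 1.0, 2.0, 4.0])
H = np.array([[2.0, 1.0],
              [0.5, 3.0]])  # an arbitrary non-degenerate projective map
before = cross_ratio(*pts)
after = cross_ratio(*homography_1d(pts, H))
# before == after (up to floating point): the cross-ratio is preserved
# under projection, which is what lets a marker encode IDs that remain
# readable from arbitrary camera viewpoints.
```

CylinderTag applies this along the zero-curvature direction of the cylinder, where points remain collinear after unrolling and projection.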