 obstacle detection


AI-Enabled Capabilities to Facilitate Next-Generation Rover Surface Operations

arXiv.org Artificial Intelligence

Contemporary Mars rovers such as Curiosity and Perseverance operate at average speeds on the order of 4.2 cm/s, with daily traverses typically below 100 m [1]. These constraints stem from conservative operational approaches necessitated by communication delays, irreplaceable hardware, and limited onboard processing capabilities. The traditional Sense-Model-Plan-Act (SMPA) paradigm requires frequent stops for terrain analysis, preventing continuous motion and severely limiting mission scope and scientific return. Missions requiring long-range access to diverse geological targets (sample-return campaigns) are particularly affected by these mobility constraints [2]. Recent advances in computer vision (CV) algorithms, compact ML models, and space-qualified computing platforms offer a practical path to maintaining safety while increasing autonomy and traverse speeds. In this work, we present a set of AI-enabled systems developed under ESA contracts RAPID, FASTNAV, ViBEKO and AIAXR, and CISRU. These systems were validated in Mars- and Lunar-analogue field trials and demonstrate substantial improvements in mobility and perception accuracy. The contributions presented in this work are: (1) a far-obstacle detection component which facilitates continuous motion at speeds in excess of 1.0 m/s; (2) a coordination framework enabling multi-robot human-robot workflows for resource extraction and handling; and (3) a suite of terrain classification models for operations.


GPS Denied IBVS-Based Navigation and Collision Avoidance of UAV Using a Low-Cost RGB Camera

arXiv.org Artificial Intelligence

Abstract-- This paper proposes an image-based visual servoing (IBVS) framework for UAV navigation and collision avoidance using only an RGB camera. While UAV navigation has been extensively studied, it remains challenging to apply IBVS in missions involving multiple visual targets and collision avoidance. The proposed method achieves navigation without explicit path planning, and collision avoidance is realized through AI-based monocular depth estimation from RGB images. Unlike approaches that rely on stereo cameras or external workstations, our framework runs fully onboard a Jetson platform, ensuring a self-contained and deployable system. Experimental results validate that the UAV can navigate across multiple AprilTags and avoid obstacles effectively in GPS-denied environments. I. INTRODUCTION Most UAV applications depend on position estimation provided by global positioning systems (GPS). However, GPS is often unavailable in indoor, mountainous, or forest environments, motivating the use of computer vision for UAV navigation. This paper focuses on image-based visual servoing (IBVS) with an onboard RGB camera.


Novel Pigeon-inspired 3D Obstacle Detection and Avoidance Maneuver for Multi-UAV Systems

arXiv.org Artificial Intelligence

Recent advances in multi-agent systems manipulation have demonstrated a rising demand for the implementation of multi-UAV systems in urban areas, which are always subjected to the presence of static and dynamic obstacles. Inspired by the collective behavior of tilapia fish and pigeons, the focus of the presented research is on the introduction of a nature-inspired collision-free formation control for a multi-UAV system, considering the obstacle avoidance maneuvers. The developed framework in this study utilizes a semi-distributed control approach, in which, based on a probabilistic Lloyd's algorithm, a centralized guidance algorithm works for optimal positioning of the UAVs, while a distributed control approach has been used for the intervehicle collision and obstacle avoidance. Further, the presented framework has been extended to the 3D space with a novel definition of 3D maneuvers. Collision Avoidance, Centroidal Voronoi Tessellation, Distributed Control, Formation Control, Multi-Agent System, Obstacle Avoidance. From an engineering perspective, swarm intelligence shows how decentralized systems, composed of numerous simple agents, can achieve complex collective behaviors.
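
As a rough illustration of the centroidal Voronoi positioning mentioned above, the sketch below runs a plain, sample-based Lloyd iteration in 2D. It is not the probabilistic variant used in the paper, whose details are not given here. The function names (`lloyd_step`, `lloyd`, `coverage_cost`), the unit-square workspace, the uniform sampling, and the starting positions are all assumptions made for the example.

```python
import random


def nearest(generators, point):
    """Index of the generator closest to `point` (squared Euclidean distance)."""
    px, py = point
    return min(range(len(generators)),
               key=lambda k: (generators[k][0] - px) ** 2 + (generators[k][1] - py) ** 2)


def coverage_cost(generators, samples):
    """Sum of squared distances from each sample to its nearest generator."""
    total = 0.0
    for p in samples:
        gx, gy = generators[nearest(generators, p)]
        total += (gx - p[0]) ** 2 + (gy - p[1]) ** 2
    return total


def lloyd_step(generators, samples):
    """One Lloyd iteration: assign samples to Voronoi cells, move each generator to its cell centroid."""
    sums = [[0.0, 0.0, 0] for _ in generators]
    for p in samples:
        cell = sums[nearest(generators, p)]
        cell[0] += p[0]
        cell[1] += p[1]
        cell[2] += 1
    # A generator whose cell received no samples stays where it is.
    return [(s[0] / s[2], s[1] / s[2]) if s[2] else g
            for s, g in zip(sums, generators)]


def lloyd(generators, samples, iterations=20):
    """Repeat the Lloyd step a fixed number of times."""
    for _ in range(iterations):
        generators = lloyd_step(generators, samples)
    return generators


# Hypothetical setup: four UAVs starting clustered in one corner of a unit square.
rng = random.Random(0)
samples = [(rng.random(), rng.random()) for _ in range(2000)]
start = [(0.10, 0.10), (0.15, 0.10), (0.10, 0.15), (0.15, 0.15)]
uavs = lloyd(start, samples)
```

Each step cannot increase `coverage_cost`, which is why the iteration spreads the generators over the sampled area; a centralized guidance layer could run such a loop and hand the resulting positions to the vehicles as targets.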


AI and Vision based Autonomous Navigation of Nano-Drones in Partially-Known Environments

arXiv.org Artificial Intelligence

The miniaturisation of sensors and processors, the advancements in connected edge intelligence, and the exponential interest in Artificial Intelligence are boosting the affirmation of autonomous nano-size drones in the Internet of Robotic Things ecosystem. However, achieving safe autonomous navigation and high-level tasks such as exploration and surveillance with these tiny platforms is extremely challenging due to their limited resources. This work focuses on enabling the safe and autonomous flight of a pocket-size, 30-gram platform called Crazyflie 2.1 in a partially known environment. We propose a novel AI-aided, vision-based reactive planning method for obstacle avoidance under the ambit of Integrated Sensing, Computing and Communication paradigm. We deal with the constraints of the nano-drone by splitting the navigation task into two parts: a deep learning-based object detector runs on the edge (external hardware) while the planning algorithm is executed onboard. The results show the ability to command the drone at 8 frames-per-second and a model performance reaching a COCO mean-average-precision of 60.8. Field experiments demonstrate the feasibility of the solution with the drone flying at a top speed of 1 m/s while steering away from an obstacle placed in an unknown position and reaching the target destination. The outcome highlights the compatibility of the communication delay and the model performance with the requirements of the real-time navigation task. We provide a feasible alternative to a fully onboard implementation that can be extended to autonomous exploration with nano-drones. The Internet of Robotic Things (IoRT) is an emerging Internet of Things paradigm where robots are provided advanced situational awareness thanks to sensors and data analytics methods implemented onboard and on the edge [1].


Towards Automated Safety Requirements Derivation Using Agent-based RAG

arXiv.org Artificial Intelligence

We study the automated derivation of safety requirements in a self-driving vehicle use case, leveraging LLMs in combination with agent-based retrieval-augmented generation. Conventional approaches that utilise pre-trained LLMs to assist in safety analyses typically lack domain-specific knowledge. Existing RAG approaches address this issue, yet their performance deteriorates when handling complex queries and it becomes increasingly hard to retrieve the most relevant information. This is particularly relevant for safety-relevant applications. In this paper, we propose the use of agent-based RAG to derive safety requirements and show that the retrieved information is more relevant to the queries. We implement an agent-based approach on a document pool of automotive standards and the Apollo case study, as a representative example of an automated driving perception system. Our solution is tested on a data set of safety requirement questions and answers, extracted from the Apollo data. Evaluating a set of selected RAG metrics, we present and discuss advantages of an agent-based approach compared to default RAG methods.


LLM-Glasses: GenAI-driven Glasses with Haptic Feedback for Navigation of Visually Impaired People

arXiv.org Artificial Intelligence

Abstract-- We present LLM-Glasses, a wearable navigation system designed to assist visually impaired individuals by combining haptic feedback, YOLO-World object detection, and GPT-4o-driven reasoning. The system delivers real-time tactile guidance via temple-mounted actuators, enabling intuitive and independent navigation. Three user studies were conducted to evaluate its effectiveness: (1) a haptic pattern recognition study achieving an 81.3% average recognition rate across 13 distinct patterns, (2) a VICON-based navigation study in which participants successfully followed predefined paths in open spaces, and (3) an LLM-guided video evaluation demonstrating 91.8% accuracy in open scenarios, 84.6% with static obstacles, and 81.5% with dynamic obstacles. These results demonstrate the system's reliability in controlled environments, with ongoing work focusing on refining its responsiveness and adaptability to diverse real-world scenarios. LLM-Glasses showcases the potential of combining generative AI with haptic interfaces to empower visually impaired individuals with intuitive and effective mobility solutions.


LV-DOT: LiDAR-visual dynamic obstacle detection and tracking for autonomous robot navigation

arXiv.org Artificial Intelligence

Accurate perception of dynamic obstacles is essential for autonomous robot navigation in indoor environments. Although sophisticated 3D object detection and tracking methods have been investigated and developed thoroughly in the fields of computer vision and autonomous driving, their demands on expensive and high-accuracy sensor setups and substantial computational resources from large neural networks make them unsuitable for indoor robotics. Recently, more lightweight perception algorithms leveraging onboard cameras or LiDAR sensors have emerged as promising alternatives. However, relying on a single sensor poses significant limitations: cameras have limited fields of view and can suffer from high noise, whereas LiDAR sensors operate at lower frequencies and lack the richness of visual features. To address this limitation, we propose a dynamic obstacle detection and tracking framework that uses both onboard camera and LiDAR data to enable lightweight and accurate perception. Our proposed method expands on our previous ensemble detection approach, which integrates outputs from multiple low-accuracy but computationally efficient detectors to ensure real-time performance on the onboard computer. In this work, we propose a more robust fusion strategy that integrates both LiDAR and visual data to enhance detection accuracy further. We then utilize a tracking module that adopts feature-based object association and the Kalman filter to track and estimate detected obstacles' states. In addition, a dynamic obstacle classification algorithm is designed to robustly identify moving objects. The dataset evaluation demonstrates better perception performance compared to benchmark methods. The physical experiments on a quadcopter robot confirm the feasibility for real-world navigation.
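
The tracking module described above relies on a Kalman filter to estimate obstacle states. The following is a minimal sketch of that idea only, not the authors' implementation: a constant-velocity filter for a single axis that measures position and infers velocity. The class name `CVKalman1D`, the motion model, and the noise values `q` and `r` are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class CVKalman1D:
    """Constant-velocity Kalman filter for one axis; only position is measured."""
    x: float = 0.0    # position estimate
    v: float = 0.0    # velocity estimate
    # Entries of the symmetric 2x2 covariance [[pxx, pxv], [pxv, pvv]].
    pxx: float = 1.0
    pxv: float = 0.0
    pvv: float = 1.0
    q: float = 1e-3   # process noise added to both diagonal terms (assumed value)
    r: float = 1e-2   # measurement noise variance (assumed value)

    def predict(self, dt: float) -> None:
        # State: x <- x + v*dt. Covariance: P <- F P F^T + Q, with F = [[1, dt], [0, 1]].
        self.x += self.v * dt
        pxx = self.pxx + 2.0 * dt * self.pxv + dt * dt * self.pvv + self.q
        pxv = self.pxv + dt * self.pvv
        self.pxx, self.pxv = pxx, pxv
        self.pvv += self.q

    def update(self, z: float) -> None:
        # Measurement model H = [1, 0]; innovation variance S = pxx + r.
        s = self.pxx + self.r
        kx, kv = self.pxx / s, self.pxv / s   # Kalman gain
        y = z - self.x                        # innovation
        self.x += kx * y
        self.v += kv * y
        # Covariance: P <- (I - K H) P.
        pxx = (1.0 - kx) * self.pxx
        pxv = (1.0 - kx) * self.pxv
        pvv = self.pvv - kv * self.pxv
        self.pxx, self.pxv, self.pvv = pxx, pxv, pvv


# Hypothetical usage: an obstacle moving at 1 m/s, observed every 0.1 s for 5 s.
kf = CVKalman1D()
dt = 0.1
for k in range(1, 51):
    kf.predict(dt)
    kf.update(k * dt)   # noise-free position measurement, for simplicity
```

A velocity estimate of this kind is one plausible input to a moving-versus-static decision; in a 3D tracker the same predict/update pair would run per axis or on a joint state vector.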


DJI Flip review: A unique and useful creator drone with a few flaws

Engadget

After creating a stir with the $200 Neo, DJI is back at it with another innovative drone, the Flip. It has a first-of-a-kind folding design and shrouded propellers to keep people safe. It also integrates 3D infrared obstacle detection to track subjects and has a long list of impressive features. With a camera borrowed from the Mini 4 Pro, the Flip can take high-quality 4K 60p video indoors or out with little risk. It comes with vlogger-friendly features like Direction Track and Quickshots for social media.


DJI Air 3S review: LiDAR and improved image quality make for a nearly faultless drone

Engadget

DJI just announced the dual-camera Air 3S drone and there's some all-new cutting-edge tech hiding in the nose. A LiDAR sensor is there to provide extra crash protection at night, a time that's often dangerous for drones. The Air 3S also has a new main camera with a larger sensor better suited for capturing video in low-light. And it now comes with the company's ActiveTrack 360, which it first introduced in the Mini 4 Pro, allowing the device to zoom all around your subject while tracking and filming them. There are a bunch of other little improvements, from storage to the new panoramic photo mode, all at the same $1,099 price as the Air 3 was at launch.


Innovative Deep Learning Techniques for Obstacle Recognition: A Comparative Study of Modern Detection Algorithms

arXiv.org Artificial Intelligence

Obstacle detection is critical in autonomous systems, smart surveillance, and industrial automation. The advent of deep learning, particularly CNNs, significantly improved detection accuracy and efficiency. YOLO models have been a cornerstone in this evolution, with each version bringing [...] speed and adaptability in real-time scenarios. YOLO (You Only Look Once) models, from YOLOv5 to the latest YOLOv8, have pushed the boundaries of speed and accuracy, making them ideal for applications that demand quick and reliable detection in dynamic environments. YOLOv8: Integrated advanced loss functions and feature fusion methods for superior accuracy.