
IGUANA: Immersive Guidance, Navigation, and Control for Consumer UAV

Victor, Victor, Krisanty, Tania, McGinity, Matthew, Gumhold, Stefan, Aßmann, Uwe

arXiv.org Artificial Intelligence

As the markets for unmanned aerial vehicles (UAVs) and mixed reality (MR) headsets continue to grow, recent research has increasingly explored their integration, which enables more intuitive, immersive, and situationally aware control systems. We present IGUANA, an MR-based immersive guidance, navigation, and control system for consumer UAVs. IGUANA introduces three key elements beyond conventional control interfaces: (1) a 3D terrain map interface with draggable waypoint markers and live camera preview for high-level control, (2) a novel spatial control metaphor that uses a virtual ball as a physical analogy for low-level control, and (3) a spatial overlay that helps track the UAV when it is not visible to the naked eye or when visual line of sight is interrupted. We conducted a user study to evaluate our design, both quantitatively and qualitatively, and found that (1) the 3D map interface is intuitive and easy to use, relieving users from manual control and suggesting improved accuracy and consistency with lower perceived workload relative to a conventional dual-stick controller, (2) the virtual ball interface is intuitive but limited by the lack of physical feedback, and (3) the spatial overlay is very useful in enhancing the users' situational awareness.


Chat with UAV -- Human-UAV Interaction Based on Large Language Models

Wang, Haoran, Chen, Zhuohang, Li, Guang, Ma, Bo, Li, Chuanghuang

arXiv.org Artificial Intelligence

The future of UAV interaction systems is evolving from engineer-driven to user-driven, aiming to replace traditional predefined Human-UAV Interaction (HUI) designs. This shift focuses on enabling more personalized task planning and design, thereby achieving a higher-quality interaction experience and greater flexibility, which can be used in many fields, such as agriculture, aerial photography, logistics, and environmental monitoring. However, due to the lack of a common language between users and UAVs, such interactions are often difficult to achieve. Large Language Models (LLMs) can understand natural language and robots' (UAVs') behaviors, which makes personalized Human-UAV Interaction possible. Recently, some HUI frameworks based on LLMs have been proposed, but they commonly suffer from difficulties in mixed task planning and execution, leading to low adaptability in complex scenarios. In this paper, we propose a novel dual-agent HUI framework. This framework constructs two independent LLM agents (a task planning agent and an execution agent) and applies different prompt engineering to each, so that the understanding, planning, and execution of tasks are handled separately. To verify the effectiveness and performance of the framework, we have built a task database covering four typical application scenarios of UAVs and quantified the performance of the HUI framework using three independent metrics. We also compare the performance of different LLMs when controlling the UAVs. The results of our user study demonstrate that the framework improves the smoothness of HUI and the flexibility of task execution in the task scenarios we set up, effectively meeting users' personalized needs.
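The split between a planning agent and an execution agent described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `LLM` callable type, the two prompt templates, and the function names are hypothetical and are not taken from the paper, which does not publish its prompts or interfaces in the abstract.

```python
from typing import Callable, List

# Hypothetical interface: any function that maps a prompt to a completion.
LLM = Callable[[str], str]

# Illustrative prompt templates (assumptions, not the paper's prompts).
PLANNER_PROMPT = "Break the user request into one UAV step per line:\n{request}"
EXECUTOR_PROMPT = "Translate this step into a single UAV command:\n{step}"


def plan(request: str, planner: LLM) -> List[str]:
    """Planning agent: turn a free-form request into ordered steps."""
    reply = planner(PLANNER_PROMPT.format(request=request))
    # Keep non-empty lines only; each line is treated as one step.
    return [line.strip() for line in reply.splitlines() if line.strip()]


def execute(steps: List[str], executor: LLM) -> List[str]:
    """Execution agent: map each planned step to a command string."""
    return [executor(EXECUTOR_PROMPT.format(step=s)).strip() for s in steps]


def run(request: str, planner: LLM, executor: LLM) -> List[str]:
    """Chain the two agents: plan first, then execute step by step."""
    return execute(plan(request, planner), executor)
```

Because each agent is just a callable, the two roles can be backed by different models or different prompts, which is one plausible reading of "applies different prompt engineering" above.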


Multi-Task Bayesian Optimization for Tuning Decentralized Trajectory Generation in Multi-UAV Systems

Manzoni, Marta, Nazzari, Alessandro, Rubinacci, Roberto, Lovera, Marco

arXiv.org Artificial Intelligence

We treat each task as a trajectory generation scenario defined by a specific number of drone-to-drone interactions. To model relationships across scenarios, we employ Multi-Task Gaussian Processes, which capture shared structure across tasks and enable efficient information transfer during optimization. We compare two strategies: optimizing the average mission time across all tasks and optimizing each task individually. Through a comprehensive simulation campaign, we show that single-task optimization leads to progressively shorter mission times as swarm size grows, but requires significantly more optimization time than the average-task approach.

Keywords: Multi-Task Bayesian Optimization; Gaussian Processes; Multi-agent systems; UAV; Trajectory generation

1. INTRODUCTION

In recent years, research efforts and real-world applications of Unmanned Aerial Vehicles (UAVs) have increasingly shifted from single-agent to multi-agent systems.
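As a hedged illustration of the multi-task Gaussian process and the two optimization strategies named in the abstract above, the block below writes out one common formulation. The abstract does not state which kernel is used, so the intrinsic coregionalization form and the symbols $B$, $L$, $v$, $k_x$, $f_t$, and $T$ are assumptions introduced here.

```latex
% Assumed multi-task kernel: a task-similarity matrix B scales a shared input kernel k_x.
k\big((x, t), (x', t')\big) = B_{t t'} \, k_x(x, x'),
\qquad B = L L^{\top} + \operatorname{diag}(v)

% Strategy 1: one parameter set minimizing the average mission time over T tasks.
x^{\star}_{\mathrm{avg}} = \arg\min_{x} \frac{1}{T} \sum_{t=1}^{T} f_t(x)

% Strategy 2: a separate parameter set for each task t.
x^{\star}_{t} = \arg\min_{x} f_t(x), \qquad t = 1, \dots, T
```

Here $f_t(x)$ stands for the mission time of scenario $t$ under tuning parameters $x$. The second strategy solves $T$ problems instead of one, which is consistent with the longer optimization time the abstract reports for per-task tuning.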


BEDI: A Comprehensive Benchmark for Evaluating Embodied Agents on UAVs

Guo, Mingning, Wu, Mengwei, He, Jiarun, Li, Shaoxian, Li, Haifeng, Tao, Chao

arXiv.org Artificial Intelligence

With the rapid advancement of low-altitude remote sensing and Vision-Language Models (VLMs), Embodied Agents based on Unmanned Aerial Vehicles (UAVs) have shown significant potential in autonomous tasks. However, current evaluation methods for UAV-Embodied Agents (UAV-EAs) remain constrained by the lack of standardized benchmarks, diverse testing scenarios and open system interfaces. To address these challenges, we propose BEDI (Benchmark for Embodied Drone Intelligence), a systematic and standardized benchmark designed for evaluating UAV-EAs. Specifically, we introduce a novel Dynamic Chain-of-Embodied-Task paradigm based on the perception-decision-action loop, which decomposes complex UAV tasks into standardized, measurable subtasks. Building on this paradigm, we design a unified evaluation framework encompassing six core sub-skills: semantic perception, spatial perception, motion control, tool utilization, task planning and action generation. Furthermore, we develop a hybrid testing platform that incorporates a wide range of both virtual and real-world scenarios, enabling a comprehensive evaluation of UAV-EAs across diverse contexts. The platform also offers open and standardized interfaces, allowing researchers to customize tasks and extend scenarios, thereby enhancing flexibility and scalability in the evaluation process. Finally, through empirical evaluations of several state-of-the-art (SOTA) VLMs, we reveal their limitations in embodied UAV tasks, underscoring the critical role of the BEDI benchmark in advancing embodied intelligence research and model optimization. By filling the gap in systematic and standardized evaluation within this field, BEDI facilitates objective model comparison and lays a robust foundation for future development in this field. Our benchmark is now publicly available at https://github.com/lostwolves/BEDI.


Optimal Safety-Aware Scheduling for Multi-Agent Aerial 3D Printing with Utility Maximization under Dependency Constraints

Stamatopoulos, Marios-Nektarios, Velhal, Shridhar, Banerjee, Avijit, Nikolakopoulos, George

arXiv.org Artificial Intelligence

This article presents a novel coordination and task-planning framework to enable the simultaneous conflict-free collaboration of multiple unmanned aerial vehicles (UAVs) for aerial 3D printing. The proposed framework formulates an optimization problem that takes a construction mission divided into sub-tasks and a team of autonomous UAVs, along with limits on material volume and battery. It generates an optimal mission plan comprising task assignments and scheduling, while accounting for task dependencies arising from the geometric and structural requirements of the 3D design, inter-UAV safety constraints, material usage and total flight time of each UAV. The potential conflicts occurring during the simultaneous operation of the UAVs are addressed at a segment level by dynamically selecting the starting time and location of each task to guarantee collision-free parallel execution. An importance prioritization is proposed to accelerate the computation by guiding the solution towards more important tasks. Additionally, a utility maximization formulation is proposed to dynamically determine the optimal number of UAVs required for a given mission, balancing the trade-off between minimizing makespan and the deployment of excess agents. The proposed framework's effectiveness is evaluated through a Gazebo-based simulation setup, where agents are coordinated by a mission control module allocating the printing tasks based on the generated optimal scheduling plan while remaining within the material and battery constraints of each UAV. A video of the whole mission is available at the following link: https://youtu.be/b4jwhkNPT

Note to Practitioners: This framework addresses the critical need for efficiency and safety in planning and scheduling multiple aerial robots for parallel aerial 3D printing. Existing approaches lack safety guarantees for UAVs during parallel construction. This work tackles these challenges by ensuring safety during parallel operations and effectively managing task dependencies.
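To make the role of task dependencies and makespan concrete, here is a minimal sketch of greedy list scheduling under precedence constraints. It is not the optimal formulation from the article: it ignores battery, material, and inter-UAV safety constraints, and the function name, task names, and durations are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def greedy_schedule(durations, deps, n_agents):
    """Greedy list scheduling; returns ({task: (agent, start, end)}, makespan).

    durations: {task: duration}; deps: {task: [prerequisite tasks]}.
    A heuristic sketch only, with no optimality guarantee.
    """
    # Visit tasks so that every prerequisite comes before its dependants.
    graph = {task: deps.get(task, ()) for task in durations}
    order = TopologicalSorter(graph).static_order()

    agent_free = [0.0] * n_agents  # time at which each agent becomes idle
    plan = {}
    for task in order:
        # A task may start only after all of its prerequisites have finished.
        ready = max((plan[d][2] for d in deps.get(task, ())), default=0.0)
        # Pick the agent that can start this task the earliest.
        agent = min(range(n_agents), key=lambda a: max(agent_free[a], ready))
        start = max(agent_free[agent], ready)
        end = start + durations[task]
        plan[task] = (agent, start, end)
        agent_free[agent] = end
    return plan, max(agent_free)
```

Running the same mission with different values of `n_agents` and comparing the resulting makespans gives a rough feel for the trade-off the article describes between shorter missions and deploying excess agents.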


Mobility Induced Sensitivity of UAV based Nodes to Jamming in Private 5G Airfield Networks An Experimental Study

Mykytyn, Pavlo, Chitauro, Ronald, Yener, Onur, Langendoerfer, Peter

arXiv.org Artificial Intelligence

This work presents an experimental performance evaluation of a private 5G airfield network under controlled directional SDR jamming attacks targeting UAV-based UE nodes. Using a QualiPoc Android UE, mounted as a payload on a quad-copter UAV, we conducted a series of experiments to evaluate signal degradation, handover performance, and service stability in the presence of constant directional jamming. The conducted experiments aimed to examine the effects of varying travel speeds, altitudes, and moving patterns of a UAV-based UE to record and analyze the key physical-layer and network-layer metrics such as CQI, MCS, RSRP, SINR, BLER, Net PDSCH Throughput and RLF. The results of this work describe the link stability and signal degradation dependencies, caused by the level of mobility of the UAV-based UE nodes during autonomous and automatic operation in private 5G Airfield networks.


Agentic UAVs: LLM-Driven Autonomy with Integrated Tool-Calling and Cognitive Reasoning

Koubaa, Anis, Gabr, Khaled

arXiv.org Artificial Intelligence

Unmanned Aerial Vehicles (UAVs) are increasingly used in defense, surveillance, and disaster response, yet most systems still operate at SAE Level 2 to 3 autonomy. Their dependence on rule-based control and narrow AI limits adaptability in dynamic and uncertain missions. Current UAV architectures lack context-aware reasoning, autonomous decision-making, and integration with external systems. Importantly, none make use of Large Language Model (LLM) agents with tool-calling for real-time knowledge access. This paper introduces the Agentic UAVs framework, a five-layer architecture consisting of Perception, Reasoning, Action, Integration, and Learning. The framework enhances UAV autonomy through LLM-driven reasoning, database querying, and interaction with third-party systems. A prototype built with ROS 2 and Gazebo combines YOLOv11 for object detection with GPT-4 for reasoning and a locally deployed Gemma 3 model. In simulated search-and-rescue scenarios, agentic UAVs achieved higher detection confidence (0.79 compared to 0.72), improved person detection rates (91% compared to 75%), and a major increase in correct action recommendations (92% compared to 4.5%). These results show that modest computational overhead can enable significantly higher levels of autonomy and system-level integration.


Multi-UAV Swarm Obstacle Avoidance Based on Potential Field Optimization

Hu, Yendo, Wu, Yiliang, Chen, Weican

arXiv.org Artificial Intelligence

In multi-UAV scenarios, the traditional Artificial Potential Field (APF) method often produces redundant flight paths and frequent abrupt heading changes because of poor obstacle-avoidance path planning, and it is highly prone to inter-UAV collisions during obstacle avoidance. To address these issues, this study proposes a novel hybrid algorithm that combines the improved Multi-Robot Formation Obstacle Avoidance (MRF-IAPF) algorithm with an enhanced APF optimized for single-UAV path planning. Its core ideas are as follows: first, it integrates the three types of interaction forces from MRF-IAPF (obstacle repulsion force, inter-UAV interaction force, and target attraction force); second, it incorporates a refined single-UAV path optimization mechanism, including collision risk assessment and an auxiliary sub-goal strategy. When a UAV faces a high collision threat, temporary waypoints are generated to guide obstacle avoidance, ensuring eventual precise arrival at the actual target. Simulation results demonstrate that, compared with traditional APF-based formation algorithms, the proposed algorithm achieves significant improvements in path length optimization and heading stability, and can effectively avoid obstacles and quickly restore the formation configuration, thus verifying its applicability and effectiveness in static environments with unknown obstacles.
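The three force terms named in this abstract can be sketched with the textbook artificial potential field equations. This is a sketch of classic APF only, not the paper's improved algorithm: the gains, influence radii, and the choice to model the inter-UAV interaction as pure short-range repulsion are assumptions made for illustration.

```python
import math


def attractive_force(pos, goal, k_att=1.0):
    """Linear pull toward the goal: F = k_att * (goal - pos)."""
    return tuple(k_att * (g - p) for p, g in zip(pos, goal))


def repulsive_force(pos, obstacle, k_rep=1.0, influence=2.0):
    """Classic APF repulsion; zero outside the influence radius."""
    diff = tuple(p - o for p, o in zip(pos, obstacle))
    dist = math.hypot(*diff)
    if dist >= influence or dist == 0.0:
        return tuple(0.0 for _ in pos)
    # Magnitude from the gradient of 0.5 * k_rep * (1/d - 1/d0)^2.
    mag = k_rep * (1.0 / dist - 1.0 / influence) / dist**2
    # Direct the force away from the obstacle along the unit vector diff/dist.
    return tuple(mag * d / dist for d in diff)


def total_force(pos, goal, obstacles, neighbors, safe_dist=1.0):
    """Sum of target attraction, obstacle repulsion, and inter-UAV repulsion."""
    force = attractive_force(pos, goal)
    for obs in obstacles:
        rep = repulsive_force(pos, obs)
        force = tuple(f + r for f, r in zip(force, rep))
    for other in neighbors:
        # Assumption of this sketch: nearby UAVs act as small obstacles.
        rep = repulsive_force(pos, other, influence=safe_dist)
        force = tuple(f + r for f, r in zip(force, rep))
    return force
```

When attraction and repulsion cancel, a plain APF agent stalls in a local minimum; the temporary waypoints described above are one way to steer out of such situations, which this sketch does not implement.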


Communication-Aware Asynchronous Distributed Trajectory Optimization for UAV Swarm

Yu, Yue, Zheng, Xiaobo, He, Shaoming

arXiv.org Artificial Intelligence

UAV swarms have emerged as transformative systems for complex missions including wildfire surveillance (Julian and Kochenderfer 2019), intelligence surveillance and reconnaissance (Kolar 2020), situational awareness (Scharre 2018), and cooperative interception (Balhance et al. 2017). In these applications, trajectory optimization is the cornerstone for ensuring both mission success and operational safety (Sezer 2022; Qian et al. 2020; Sanchez-Lopez et al. 2020). Over the past decade, trajectory optimization techniques have evolved from sophisticated single-agent formulations to distributed multi-agent frameworks, driven by the increasing scale and complexity of swarm-based missions (Saravanos et al. 2023). For individual UAV trajectory optimization, a variety of numerical methods have demonstrated strong performance. Pseudospectral methods achieve high-accuracy solutions by discretizing continuous-time problems (Chai et al. 2017), while sequential quadratic programming (SQP) (Hong et al. 2021) and sequential convex programming (SCP) (Deligiannis et al. 2019) provide flexible tools for handling nonlinear dynamics and constraints.


Long Duration Inspection of GNSS-Denied Environments with a Tethered UAV-UGV Marsupial System

Martínez-Rozas, Simón, Alejo, David, Carpio, José Javier, Caballero, Fernando, Merino, Luis

arXiv.org Artificial Intelligence

Unmanned Aerial Vehicles (UAVs) have become essential tools in inspection and emergency response operations due to their high maneuverability and ability to access hard-to-reach areas. However, their limited battery life significantly restricts their use in long-duration missions. This paper presents a tethered marsupial robotic system composed of a UAV and an Unmanned Ground Vehicle (UGV), specifically designed for autonomous, long-duration inspection tasks in Global Navigation Satellite System (GNSS)-denied environments. The system extends the UAV's operational time by supplying power through a tether connected to high-capacity battery packs carried by the UGV. Our work details the hardware architecture based on off-the-shelf components to ensure replicability and describes our full-stack software framework used by the system, which is composed of open-source components and built upon the Robot Operating System (ROS). The proposed software architecture enables precise localization using a Direct LiDAR Localization (DLL) method and ensures safe path planning and coordinated trajectory tracking for the integrated UGV-tether-UAV system. We validate the system through three sets of field experiments involving (i) three manual flight endurance tests to estimate the operational duration, (ii) three experiments for validating the localization and the trajectory tracking systems, and (iii) three executions of an inspection mission to demonstrate autonomous inspection capabilities. The results of the experiments confirm the robustness and autonomy of the system in GNSS-denied environments. Finally, all experimental data have been made publicly available to support reproducibility and to serve as a common open dataset for benchmarking.