Goto

Collaborating Authors

 uav


Multi-Task Bayesian Optimization for Tuning Decentralized Trajectory Generation in Multi-UAV Systems

Manzoni, Marta, Nazzari, Alessandro, Rubinacci, Roberto, Lovera, Marco

arXiv.org Artificial Intelligence

We treat each task as a trajectory generation scenario defined by a specific number of drone-to-drone interactions. To model relationships across scenarios, we employ Multi-Task Gaussian Processes, which capture shared structure across tasks and enable efficient information transfer during optimization. We compare two strategies: optimizing the average mission time across all tasks and optimizing each task individually. Through a comprehensive simulation campaign, we show that single-task optimization leads to progressively shorter mission times as swarm size grows, but requires significantly more optimization time than the average-task approach. Keywords: Multi-Task Bayesian Optimization; Gaussian Processes; Multi-agent systems; UAV; Trajectory generation 1. INTRODUCTION In recent years, research efforts and real-world applications of Unmanned Aerial Vehicles (UAVs) have increasingly shifted from single-agent to multi-agent systems.


Chat with UAV -- Human-UAV Interaction Based on Large Language Models

Wang, Haoran, Chen, Zhuohang, Li, Guang, Ma, Bo, Li, Chuanghuang

arXiv.org Artificial Intelligence

The future of UAV interaction systems is evolving from engineer-driven to user-driven, aiming to replace traditional predefined Human-UAV Interaction designs. This shift focuses on enabling more personalized task planning and design, thereby achieving a higher quality of interaction experience and greater flexibility, which can be used in many fileds, such as agriculture, aerial photography, logistics, and environmental monitoring. However, due to the lack of a common language between users and the UAVs, such interactions are often difficult to be achieved. The developments of Large Language Models possess the ability to understand nature languages and Robots' (UAVs') behaviors, marking the possibility of personalized Human-UAV Interaction. Recently, some HUI frameworks based on LLMs have been proposed, but they commonly suffer from difficulties in mixed task planning and execution, leading to low adaptability in complex scenarios. In this paper, we propose a novel dual-agent HUI framework. This framework constructs two independent LLM agents (a task planning agent, and an execution agent) and applies different Prompt Engineering to separately handle the understanding, planning, and execution of tasks. To verify the effectiveness and performance of the framework, we have built a task database covering four typical application scenarios of UAVs and quantified the performance of the HUI framework using three independent metrics. Meanwhile different LLM models are selected to control the UAVs with compared performance. Our user study experimental results demonstrate that the framework improves the smoothness of HUI and the flexibility of task execution in the tasks scenario we set up, effectively meeting users' personalized needs.


IGUANA: Immersive Guidance, Navigation, and Control for Consumer UAV

Victor, Victor, Krisanty, Tania, McGinity, Matthew, Gumhold, Stefan, Aßmann, Uwe

arXiv.org Artificial Intelligence

As the markets for unmanned aerial vehicles (UAVs) and mixed reality (MR) headsets continue to grow, recent research has increasingly explored their integration, which enables more intuitive, immersive, and situationally aware control systems. We present IGUANA, an MR-based immersive guidance, navigation, and control system for consumer UAVs. IGUANA introduces three key elements beyond conventional control interfaces: (1) a 3D terrain map interface with draggable waypoint markers and live camera preview for high-level control, (2) a novel spatial control metaphor that uses a virtual ball as a physical analogy for low-level control, and (3) a spatial overlay that helps track the UAV when it is not visible with the naked eye or visual line of sight is interrupted. We conducted a user study to evaluate our design, both quantitatively and qualitatively, and found that (1) the 3D map interface is intuitive and easy to use, relieving users from manual control and suggesting improved accuracy and consistency with lower perceived workload relative to conventional dual-stick controller, (2) the virtual ball interface is intuitive but limited by the lack of physical feedback, and (3) the spatial overlay is very useful in enhancing the users' situational awareness.


BEDI: A Comprehensive Benchmark for Evaluating Embodied Agents on UAVs

Guo, Mingning, Wu, Mengwei, He, Jiarun, Li, Shaoxian, Li, Haifeng, Tao, Chao

arXiv.org Artificial Intelligence

With the rapid advancement of low-altitude remote sensing and Vision-Language Models (VLMs), Embodied Agents based on Unmanned Aerial Vehicles (UAVs) have shown significant potential in autonomous tasks. However, current evaluation methods for UAV-Embodied Agents (UAV-EAs) remain constrained by the lack of standardized benchmarks, diverse testing scenarios and open system interfaces. To address these challenges, we propose BEDI (Benchmark for Embodied Drone Intelligence), a systematic and standardized benchmark designed for evaluating UAV-EAs. Specifically, we introduce a novel Dynamic Chain-of-Embodied-Task paradigm based on the perception-decision-action loop, which decomposes complex UAV tasks into standardized, measurable subtasks. Building on this paradigm, we design a unified evaluation framework encompassing six core sub-skills: semantic perception, spatial perception, motion control, tool utilization, task planning and action generation. Furthermore, we develop a hybrid testing platform that incorporates a wide range of both virtual and real-world scenarios, enabling a comprehensive evaluation of UAV-EAs across diverse contexts. The platform also offers open and standardized interfaces, allowing researchers to customize tasks and extend scenarios, thereby enhancing flexibility and scalability in the evaluation process. Finally, through empirical evaluations of several state-of-the-art (SOTA) VLMs, we reveal their limitations in embodied UAV tasks, underscoring the critical role of the BEDI benchmark in advancing embodied intelligence research and model optimization. By filling the gap in systematic and standardized evaluation within this field, BEDI facilitates objective model comparison and lays a robust foundation for future development in this field. Our benchmark is now publicly available at https://github.com/lostwolves/BEDI.


Optimal Safety-Aware Scheduling for Multi-Agent Aerial 3D Printing with Utility Maximization under Dependency Constraints

Stamatopoulos, Marios-Nektarios, Velhal, Shridhar, Banerjee, Avijit, Nikolakopoulos, George

arXiv.org Artificial Intelligence

Abstract--This article presents a novel coordination and task-planning framework to enable the simultaneous conflict-free collaboration of multiple unmanned aerial vehicles (UA Vs) for aerial 3D printing. The proposed framework formulates an optimization problem that takes a construction mission divided into sub-tasks and a team of autonomous UA Vs, along with limited volume and battery. It generates an optimal mission plan comprising task assignments and scheduling, while accounting for task dependencies arising from the geometric and structural requirements of the 3D design, inter-UA V safety constraints, material usage and total flight time of each UA V. The potential conflicts occurring during the simultaneous operation of the UA Vs are addressed at a segment-level by dynamically selecting the starting time and location of each task to guarantee collision-free parallel execution. An importance prioritization is proposed to accelerate the computation by guiding the solution towards more important tasks. Additionally, a utility maximization formulation is proposed to dynamically determine the optimal number of UA Vs required for a given mission, balancing the trade-off between minimizing makespan and the deployment of excess agents. The proposed framework's effectiveness is evaluated through a Gazebo-based simulation setup, where agents are coordinated by a mission control module allocating the printing tasks based on the generated optimal scheduling plan while remaining within the material and battery constraints of each UA V. A video of the whole mission is available in the following link: https://youtu.be/b4jwhkNPT Note to Practitioners--This framework addresses the critical need for efficiency and safety in planning and scheduling multiple aerial robots for parallel aerial 3D printing. Existing approaches lack safety guarantees for UA Vs during parallel construction. This work tackles these challenges by ensuring safety during parallel operations and effectively managing task dependencies.


Mobility Induced Sensitivity of UAV based Nodes to Jamming in Private 5G Airfield Networks An Experimental Study

Mykytyn, Pavlo, Chitauro, Ronald, Yener, Onur, Langendoerfer, Peter

arXiv.org Artificial Intelligence

This work presents an e xperimental performance evaluation of a p rivate 5G a irfield n etwork under controlled directional SDR jamming attacks targeting UAV - based UE nodes . Using a QualiPoc Android UE, mounted as a payload on a quad-copter UAV, we conducted a series of experiments to evaluate signal degradation, handover performance, and service stability in the presence of constant directional jamming. The conducted experiments aimed to examin e the effe c t s of varying travel speed s, altitudes, and moving patterns of a UAV - based UE to record and analyze the key physical - layer and network - layer metrics such as CQI, MCS, RSRP, SINR, BLER, Net PDSCH Throughput and RLF. The results of this work describe the link stability and signal degradation dependencies, caused by the level of mobility of the UAV - based UE nodes during autonomous and automatic operation in private 5G Airfield networks.


Agentic UAVs: LLM-Driven Autonomy with Integrated Tool-Calling and Cognitive Reasoning

Koubaa, Anis, Gabr, Khaled

arXiv.org Artificial Intelligence

Unmanned Aerial Vehicles (UAVs) are increasingly used in defense, surveillance, and disaster response, yet most systems still operate at SAE Level 2 to 3 autonomy. Their dependence on rule-based control and narrow AI limits adaptability in dynamic and uncertain missions. Current UAV architectures lack context-aware reasoning, autonomous decision-making, and integration with external systems. Importantly, none make use of Large Language Model (LLM) agents with tool-calling for real-time knowledge access. This paper introduces the Agentic UAVs framework, a five-layer architecture consisting of Perception, Reasoning, Action, Integration, and Learning. The framework enhances UAV autonomy through LLM-driven reasoning, database querying, and interaction with third-party systems. A prototype built with ROS 2 and Gazebo combines YOLOv11 for object detection with GPT-4 for reasoning and a locally deployed Gemma 3 model. In simulated search-and-rescue scenarios, agentic UAVs achieved higher detection confidence (0.79 compared to 0.72), improved person detection rates (91% compared to 75%), and a major increase in correct action recommendations (92% compared to 4.5%). These results show that modest computational overhead can enable significantly higher levels of autonomy and system-level integration.


RSPECT: Robust and Scalable Planner for Energy-Aware Coordination of UAV-UGV Teams in Aerial Monitoring

Er, Cahit Ikbal, Kashiri, Amin, Yazicioglu, Yasin

arXiv.org Artificial Intelligence

We consider the robust planning of energy-constrained unmanned aerial vehicles (UAVs) and unmanned ground vehicles (UGVs), which act as mobile charging stations, to perform long-horizon aerial monitoring missions. More specifically, given a set of points to be visited by the UAVs and desired final positions of the UAV-UGV teams, the objective is to find a robust plan (the vehicle trajectories) that can be realized without a major revision in the face of uncertainty (e.g., unknown obstacles/terrain, wind) to complete this mission in minimum time. We provide a formal description of this problem as a mixed-integer program (MIP), which is NP-hard. Since exact solution methods are computationally intractable for such problems, we propose RSPECT, a scalable and efficient heuristic. We provide theoretical results on the complexity of our algorithm and the feasibility and robustness of resulting plans. We also demonstrate the performance of our method via simulations and experiments.


MonoSpheres: Large-Scale Monocular SLAM-Based UAV Exploration through Perception-Coupled Mapping and Planning

Musil, Tomáš, Petrlík, Matěj, Saska, Martin

arXiv.org Artificial Intelligence

Autonomous exploration of unknown environments is a key capability for mobile robots, but it is largely unsolved for robots equipped with only a single monocular camera and no dense range sensors. In this paper, we present a novel approach to monocular vision-based exploration that can safely cover large-scale unstructured indoor and outdoor 3D environments by explicitly accounting for the properties of a sparse monocular SLAM frontend in both mapping and planning. The mapping module solves the problems of sparse depth data, free-space gaps, and large depth uncertainty by oversampling free space in texture-sparse areas and keeping track of obstacle position uncertainty. The planning module handles the added free-space uncertainty through rapid replanning and perception-aware heading control. We further show that frontier-based exploration is possible with sparse monocular depth data when parallax requirements and the possibility of textureless surfaces are taken into account. We evaluate our approach extensively in diverse real-world and simulated environments, including ablation studies. To the best of the authors' knowledge, the proposed method is the first to achieve 3D monocular exploration in real-world unstructured outdoor environments. We open-source our implementation to support future research.


Multi-UAV Swarm Obstacle Avoidance Based on Potential Field Optimization

Hu, Yendo, Wu, Yiliang, Chen, Weican

arXiv.org Artificial Intelligence

In multi UAV scenarios,the traditional Artificial Potential Field (APF) method often leads to redundant flight paths and frequent abrupt heading changes due to unreasonable obstacle avoidance path planning,and is highly prone to inter UAV collisions during the obstacle avoidance process.To address these issues,this study proposes a novel hybrid algorithm that combines the improved Multi-Robot Formation Obstacle Avoidance (MRF IAPF) algorithm with an enhanced APF optimized for single UAV path planning.Its core ideas are as follows:first,integrating three types of interaction forces from MRF IAPF obstacle repulsion force,inter UAV interaction force,and target attraction force;second,incorporating a refined single UAV path optimization mechanism,including collision risk assessment and an auxiliary sub goal strategy.When a UAV faces a high collision threat,temporary waypoints are generated to guide obstacle avoidance,ensuring eventual precise arrival at the actual target.Simulation results demonstrate that compared with traditional APF based formation algorithms,the proposed algorithm achieves significant improvements in path length optimization and heading stability,can effectively avoid obstacles and quickly restore the formation configuration,thus verifying its applicability and effectiveness in static environments with unknown obstacles.