trajectory planning
Bayesian Active Inference for Intelligent UAV Anti-Jamming and Adaptive Trajectory Planning
Krayani, Ali, Sadati, Seyedeh Fatemeh, Marcenaro, Lucio, Regazzoni, Carlo
Abstract--This paper proposes a hierarchical trajectory planning framework for UAVs operating under adversarial jamming conditions. Leveraging Bayesian Active Inference, the approach combines expert-generated demonstrations with probabilistic generative modeling to encode high-level symbolic planning, low-level motion policies, and wireless signal feedback. During deployment, the UAV performs online inference to anticipate interference, localize jammers, and adapt its trajectory accordingly--without prior knowledge of jammer locations. Simulation results demonstrate that the proposed method achieves near-expert performance, significantly reducing communication interference and mission cost compared to model-free reinforcement learning baselines, while maintaining robust generalization in dynamic environments.
Unmanned Aerial Vehicles (UAVs) play a crucial role in military, public, and civilian applications due to their compact size, flexible deployment capabilities, and outstanding performance.
- Information Technology > Communications > Networks (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.51)
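A minimal sketch of the kind of online inference the abstract describes: the UAV maintains a belief over candidate jammer positions and updates it from interference-power measurements, with no prior knowledge of the jammer's location. The 1D grid, the inverse-square path-loss model, and all numeric values below are illustrative assumptions, not the paper's generative model.

```python
import numpy as np

# Hypothetical 1D illustration of online Bayesian jammer localization.
# The path-loss model and noise levels are assumptions for illustration.
GRID = np.linspace(0.0, 100.0, 101)           # candidate jammer positions (m)
belief = np.full(GRID.size, 1.0 / GRID.size)  # uniform prior

def expected_power(jammer_pos, uav_pos, p_tx=1.0):
    """Assumed inverse-square path-loss model for received jammer power."""
    d = np.maximum(np.abs(jammer_pos - uav_pos), 1.0)
    return p_tx / d**2

def update_belief(belief, uav_pos, measured_power, sigma=0.003):
    """One Bayes step: posterior proportional to likelihood times prior."""
    mu = expected_power(GRID, uav_pos)
    likelihood = np.exp(-0.5 * ((measured_power - mu) / sigma) ** 2)
    posterior = likelihood * belief
    return posterior / posterior.sum()

# Simulate a jammer at 60 m; the UAV measures power at several waypoints.
rng = np.random.default_rng(0)
true_jammer = 60.0
for uav_pos in [10.0, 30.0, 50.0, 70.0]:
    z = expected_power(true_jammer, uav_pos) + rng.normal(0.0, 0.001)
    belief = update_belief(belief, uav_pos, z)

estimate = GRID[np.argmax(belief)]
```

Each waypoint constrains the set of grid positions consistent with the measured power, and the product of likelihoods concentrates the belief near the true jammer; in the paper this belief would then feed the trajectory-adaptation step.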
Multi-Agent Reinforcement Learning and Real-Time Decision-Making in Robotic Soccer for Virtual Environments
The deployment of multi-agent systems in dynamic, adversarial environments like robotic soccer necessitates real-time decision-making, sophisticated cooperation, and scalable algorithms to avoid the curse of dimensionality. While Reinforcement Learning (RL) offers a promising framework, existing methods often struggle with the multi-granularity of tasks (long-term strategy vs. instant actions) and the complexity of large-scale agent interactions. This paper presents a unified Multi-Agent Reinforcement Learning (MARL) framework that addresses these challenges. First, we establish a baseline using Proximal Policy Optimization (PPO) within a client-server architecture for real-time action scheduling, with PPO demonstrating superior performance (4.32 avg. goals, 82.9% ball control). Second, we introduce a Hierarchical RL (HRL) structure based on the options framework to decompose the problem into a high-level trajectory planning layer (modeled as a Semi-Markov Decision Process) and a low-level action execution layer, improving global strategy (avg. goals increased to 5.26). Finally, to ensure scalability, we integrate mean-field theory into the HRL framework, simplifying many-agent interactions into a single agent vs. the population average. Our mean-field actor-critic method achieves a significant performance boost (5.93 avg. goals, 89.1% ball control, 92.3% passing accuracy) and enhanced training stability. Extensive simulations of 4v4 matches in the Webots environment validate our approach, demonstrating its potential for robust, scalable, and cooperative behavior in complex multi-agent domains.
- Asia > China > Jiangsu Province > Nanjing (0.04)
- Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Architecture > Real Time Systems (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)
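The mean-field step in the abstract above can be sketched in tabular form: each agent evaluates its action against the *average* action of its neighbors rather than the joint action of every other agent, collapsing the many-agent interaction into a single summary statistic. The toy state and action spaces and the reward below are illustrative, not the paper's soccer environment.

```python
import numpy as np

# Toy tabular mean-field Q-learning: Q(s, a, mean_action) replaces
# Q(s, a_1, ..., a_N), avoiding the curse of dimensionality.
N_STATES, N_ACTIONS, N_MEAN_BINS = 4, 3, 5
Q = np.zeros((N_STATES, N_ACTIONS, N_MEAN_BINS))

def mean_action_bin(neighbor_actions):
    """Compress neighbors' actions into one discretized average."""
    mean_a = np.mean(neighbor_actions) / (N_ACTIONS - 1)   # in [0, 1]
    return min(int(mean_a * N_MEAN_BINS), N_MEAN_BINS - 1)

def mf_q_update(s, a, neighbor_actions, r, s_next, alpha=0.1, gamma=0.9):
    """One mean-field Q-learning step against the population average."""
    k = mean_action_bin(neighbor_actions)
    # Bootstrap with the best response, assuming the mean action persists.
    target = r + gamma * Q[s_next, :, k].max()
    Q[s, a, k] += alpha * (target - Q[s, a, k])
    return Q[s, a, k]

# Example: agent in state 0 takes action 2 while neighbors mostly take 1.
v = mf_q_update(s=0, a=2, neighbor_actions=[1, 1, 2], r=1.0, s_next=1)
```

The key design point is the table's shape: its size grows with the number of mean-action bins, not with the number of agents, which is what makes the approach scalable to large teams.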
Foundation Models for Trajectory Planning in Autonomous Driving: A Review of Progress and Open Challenges
Oksuz, Kemal, Buburuzan, Alexandru, Knittel, Anthony, Yao, Yuhan, Dokania, Puneet K.
The emergence of multi-modal foundation models has markedly transformed the technology for autonomous driving, shifting away from conventional and mostly hand-crafted design choices towards unified, foundation-model-based approaches, capable of directly inferring motion trajectories from raw sensory inputs. This new class of methods can also incorporate natural language as an additional modality, with Vision-Language-Action (VLA) models serving as a representative example. In this review, we provide a comprehensive examination of such methods through a unifying taxonomy to critically evaluate their architectural design choices, methodological strengths, and their inherent capabilities and limitations. Our survey covers 37 recently proposed approaches that span the landscape of trajectory planning with foundation models. Furthermore, we assess these approaches with respect to the openness of their source code and datasets, offering valuable information to practitioners and researchers. We provide an accompanying webpage that catalogs the methods based on our taxonomy, available at: https://github.com/fiveai/FMs-for-driving-trajectories
- Research Report (1.00)
- Overview (1.00)
- Transportation > Ground > Road (1.00)
- Information Technology (1.00)
AutoDrive-R$^2$: Incentivizing Reasoning and Self-Reflection Capacity for VLA Model in Autonomous Driving
Yuan, Zhenlong, Qian, Chengxuan, Tang, Jing, Chen, Rui, Song, Zijian, Sun, Lei, Chu, Xiangxiang, Cai, Yujun, Zhang, Dapeng, Li, Shuo
Vision-Language-Action (VLA) models in autonomous driving systems have recently demonstrated transformative potential by integrating multimodal perception with decision-making capabilities. However, the interpretability and coherence of the decision process and the plausibility of action sequences remain largely underexplored. To address these issues, we propose AutoDrive-R$^2$, a novel VLA framework that enhances both reasoning and self-reflection capabilities of autonomous driving systems through chain-of-thought (CoT) processing and reinforcement learning (RL). Specifically, we first propose an innovative CoT dataset named nuScenesR$^2$-6K for supervised fine-tuning, which effectively builds cognitive bridges between input information and output trajectories through a four-step logical chain with self-reflection for validation. Moreover, to maximize both reasoning and self-reflection during the RL stage, we further employ the Group Relative Policy Optimization (GRPO) algorithm within a physics-grounded reward framework that incorporates spatial alignment, vehicle dynamics, and temporal smoothness criteria to ensure reliable and realistic trajectory planning. Extensive evaluation results across both the nuScenes and Waymo datasets demonstrate the state-of-the-art performance and robust generalization capacity of our proposed method.
- Oceania > Australia > Queensland (0.04)
- Asia > China > Gansu Province > Lanzhou (0.04)
- Transportation > Ground > Road (1.00)
- Automobiles & Trucks (1.00)
- Information Technology > Robotics & Automation (0.84)
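The group-relative advantage at the heart of GRPO, as used in the abstract above, can be sketched in a few lines: several candidate trajectories are sampled for the same scene, and each one's reward is normalized against the group's mean and standard deviation, so no learned value critic is needed. The reward values below are made up for illustration, not drawn from the paper's physics-grounded reward.

```python
import numpy as np

def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each sample's reward against its own group's statistics."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + eps)

# E.g. four candidate trajectories scored by a combined reward
# (spatial alignment + dynamics + smoothness, per the abstract):
adv = group_relative_advantages([0.9, 0.4, 0.6, 0.5])
```

Trajectories above the group average receive positive advantages and are reinforced; the normalization makes the update invariant to the reward's overall scale within a group.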
SUPER-AD: Semantic Uncertainty-aware Planning for End-to-End Robust Autonomous Driving
Ryu, Wonjeong, Yu, Seungjun, Moon, Seokha, Choi, Hojun, Park, Junsung, Kim, Jinkyu, Shim, Hyunjung
End-to-End (E2E) planning has become a powerful paradigm for autonomous driving, yet current systems remain fundamentally uncertainty-blind. They assume perception outputs are fully reliable, even in ambiguous or poorly observed scenes, leaving the planner without an explicit measure of uncertainty. To address this limitation, we propose a camera-only E2E framework that estimates aleatoric uncertainty directly in BEV space and incorporates it into planning. Our method produces a dense, uncertainty-aware drivability map that captures both semantic structure and geometric layout at pixel-level resolution. To further promote safe and rule-compliant behavior, we introduce a lane-following regularization that encodes lane structure and traffic norms. This prior stabilizes trajectory planning under normal conditions while preserving the flexibility needed for maneuvers such as overtaking or lane changes. Together, these components enable robust and interpretable trajectory planning, even under challenging uncertainty conditions. Evaluated on the NAVSIM benchmark, our method achieves state-of-the-art performance, delivering substantial gains on both the challenging NAVHARD and NAVSAFE subsets. These results demonstrate that our principled aleatoric uncertainty modeling combined with driving priors significantly advances the safety and reliability of camera-only E2E autonomous driving.
- Transportation > Ground > Road (0.95)
- Information Technology > Robotics & Automation (0.85)
- Automobiles & Trucks (0.85)
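One standard way to make a dense prediction head aleatoric-uncertainty-aware, in the spirit of the abstract above, is the heteroscedastic Gaussian loss: the network predicts a mean and a log-variance per BEV cell, and the loss trades the residual penalty off against a log-variance regularizer. This is the generic Kendall-and-Gal-style formulation shown on random arrays, not the paper's exact loss.

```python
import numpy as np

def aleatoric_nll(pred_mean, pred_log_var, target):
    """Per-cell Gaussian NLL: 0.5 * exp(-s) * (y - mu)^2 + 0.5 * s."""
    sq_err = (target - pred_mean) ** 2
    return 0.5 * (np.exp(-pred_log_var) * sq_err + pred_log_var)

rng = np.random.default_rng(0)
H, W = 8, 8                                  # a tiny stand-in BEV grid
target = rng.random((H, W))
mean = target + rng.normal(0.0, 0.1, (H, W))  # accurate predictions

# With small residuals, declaring confidence (low log-variance) is
# rewarded, while declaring high variance pays the log-variance penalty:
loss_confident = aleatoric_nll(mean, np.full((H, W), -2.0), target).mean()
loss_uncertain = aleatoric_nll(mean, np.full((H, W), 2.0), target).mean()
```

The converse also holds: on cells with large residuals, a high predicted variance lowers the loss, which is exactly the mechanism that lets the planner discount ambiguous or poorly observed regions.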
Safe and Economical UAV Trajectory Planning in Low-Altitude Airspace: A Hybrid DRL-LLM Approach with Compliance Awareness
Gong, Yanwei, Fan, Junchao, Zhang, Ruichen, Niyato, Dusit, Yao, Yingying, Chang, Xiaolin
The rapid growth of the low-altitude economy has driven the widespread adoption of unmanned aerial vehicles (UAVs). This growing deployment presents new challenges for UAV trajectory planning in complex urban environments. However, existing studies often overlook key factors, such as urban airspace constraints and economic efficiency, which are essential in low-altitude economy contexts. Deep reinforcement learning (DRL) is regarded as a promising solution to these issues, while its practical adoption remains limited by low learning efficiency. To overcome this limitation, we propose a novel UAV trajectory planning framework that combines DRL with large language model (LLM) reasoning to enable safe, compliant, and economically viable path planning. Experimental results demonstrate that our method significantly outperforms existing baselines across multiple metrics, including data collection rate, collision avoidance, successful landing, regulatory compliance, and energy efficiency. These results validate the effectiveness of our approach in addressing the key challenges of UAV trajectory planning under the constraints of low-altitude economy networking.
- Transportation (1.00)
- Energy (1.00)
- Information Technology > Robotics & Automation (0.48)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- (5 more...)
From Forecasting to Planning: Policy World Model for Collaborative State-Action Prediction
Zhao, Zhida, Fu, Talas, Wang, Yifan, Wang, Lijun, Lu, Huchuan
Despite remarkable progress in driving world models, their potential for autonomous systems remains largely untapped: the world models are mostly learned for world simulation and decoupled from trajectory planning. While recent efforts aim to unify world modeling and planning in a single framework, the synergistic facilitation mechanism of world modeling for planning still requires further exploration. In this work, we introduce a new driving paradigm named Policy World Model (PWM), which not only integrates world modeling and trajectory planning within a unified architecture, but is also able to benefit planning using the learned world knowledge through the proposed action-free future state forecasting scheme. Through collaborative state-action prediction, PWM can mimic the human-like anticipatory perception, yielding more reliable planning performance. To facilitate the efficiency of video forecasting, we further introduce a dynamically enhanced parallel token generation mechanism, equipped with a context-guided tokenizer and an adaptive dynamic focal loss. Despite utilizing only front camera input, our method matches or exceeds state-of-the-art approaches that rely on multi-view and multi-modal inputs. Code and model weights will be released at https://github.com/6550Zhao/Policy-World-Model.
LEARN: Learning End-to-End Aerial Resource-Constrained Multi-Robot Navigation
Chiu, Darren, Huang, Zhehui, Ge, Ruohai, Sukhatme, Gaurav S.
Figure 1: LEARN is a lightweight, two-stage safety-guided reinforcement learning framework for multi-UAV navigation in cluttered indoor and outdoor spaces. All processes, including perception, localization, communication, planning, and control, run purely on an embedded single-core controller running at 168 MHz with 192 KB of RAM. A single policy is trained in simulation and duplicated across all quadrotors. During deployment, a minimum-snap naive planner produces goal points for the encoder. Quadrotors obtain the two closest neighbors' positions and velocities through radio, and obstacles are sensed using a low-dimensional time-of-flight sensor. The policy generates individual normalized rotor thrusts that are sent directly to the motors. LEARN is zero-shot transferable to the real world with no fine-tuning. Experiments show that it scales up to 6 quadrotors in the real world and 24 in simulation.
Abstract--Nano-UAV teams offer great agility yet face severe navigation challenges due to constrained onboard sensing, communication, and computation. Existing approaches rely on high-resolution vision or compute-intensive planners, rendering them infeasible for these platforms. Our system combines low-resolution time-of-flight (ToF) sensors and a simple motion planner with a compact, attention-based RL policy. In simulation, LEARN outperforms two state-of-the-art planners by 10% while using substantially fewer resources. We demonstrate LEARN's viability on six Crazyflie quadrotors, achieving fully onboard flight in diverse indoor and outdoor environments at speeds up to 2.0 m/s and traversing 0.2 m gaps. (All authors are with the University of Southern California.)
Unmanned aerial vehicles (UAVs) are increasingly used in domains such as surveillance [1], search and rescue [2], and planetary exploration [3]. The physics of flight impose stringent size, weight, and power (SWaP) constraints on these platforms, making efficient system design paramount. While autonomy in UAVs has advanced significantly, many state-of-the-art navigation approaches fail to scale to resource-constrained platforms. EDG-Team switches to a centralized and synchronous planner in dense environments [6].
- North America > United States > California (0.54)
- Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)
- Information Technology (0.68)
- Leisure & Entertainment (0.67)
- Energy (0.46)
Flatness-based trajectory planning for 3D overhead cranes with friction compensation and collision avoidance
Vicente-Martinez, Jorge, Ramirez-Laboreo, Edgar
Abstract--This paper presents an optimal trajectory generation method for 3D overhead cranes by leveraging differential flatness. This framework enables the direct inclusion of complex physical and dynamic constraints, such as nonlinear friction and collision avoidance for both payload and rope. Our approach allows for aggressive movements by constraining payload swing only at the final point. A comparative simulation study validates our approach, demonstrating that neglecting dry friction leads to actuator saturation and collisions. The results show that friction modeling is a fundamental requirement for fast and safe crane trajectories.
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > Canada > British Columbia > Vancouver (0.04)
- Europe > Spain > Aragón > Zaragoza Province > Zaragoza (0.04)
- (6 more...)
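The flatness idea behind the crane abstract above is easiest to see on the linearized 2D case: for small swing angles the payload obeys x_p'' = (g/l)(x_t - x_p), so the payload position x_p is a flat output and the trolley reference x_t = x_p + (l/g) x_p'' follows algebraically from x_p and its derivatives. The paper's full 3D model with friction and collision constraints is richer; this linear sketch only shows why constraints on the payload translate directly into constraints on the actuated trolley.

```python
import numpy as np

# Linearized 2D crane: payload position is a flat output, so the
# trolley trajectory is recovered algebraically, with no integration.
g, l = 9.81, 2.0                        # gravity (m/s^2), rope length (m)

def trolley_from_payload(t, x_p, ddx_p):
    """Flatness map: x_t(t) = x_p(t) + (l/g) * x_p''(t)."""
    return x_p(t) + (l / g) * ddx_p(t)

# Smooth rest-to-rest payload motion: quintic from 0 to 1 m over 4 s,
# so acceleration (and hence swing) vanishes at both endpoints.
T = 4.0
x_p = lambda t: 10 * (t / T) ** 3 - 15 * (t / T) ** 4 + 6 * (t / T) ** 5
ddx_p = lambda t: (60 * (t / T) - 180 * (t / T) ** 2
                   + 120 * (t / T) ** 3) / T ** 2

t = np.linspace(0.0, T, 201)
x_t = trolley_from_payload(t, x_p, ddx_p)
```

Because the endpoint accelerations are zero, trolley and payload coincide at start and goal, which mirrors the paper's choice of constraining payload swing only at the final point.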
A Parameter-Linear Formulation of the Optimal Path Following Problem for Robotic Manipulator
Marauli, Tobias, Gattringer, Hubert, Mueller, Andreas
In this paper the computational challenges of time-optimal path following are addressed. The standard approach is to minimize the travel time, which inevitably leads to singularities at zero path speed when reformulating the optimization problem in terms of a path parameter. Thus, generating smooth trajectories while maintaining a low computational effort is quite challenging, since the singularities have to be taken into account. To this end, a different approach is presented in this paper. This approach is based on maximizing the path speed along a prescribed path and enables numerically efficient planning of smooth trajectories. Moreover, the discrete reformulation of the underlying problem is linear in the optimization variables.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Austria > Upper Austria > Linz (0.04)
- Asia > Taiwan > Taiwan Province > Taipei (0.04)
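The linearity claimed in the abstract above comes from the standard substitution b(s) = s_dot^2: velocity limits bound b directly, and tangential acceleration equals b'(s)/2, which is linear in finite differences of b, so maximizing the path speed becomes a problem with linear constraints. The sketch below solves a scalar toy instance with the classic forward/backward sweep; the limits are made-up numbers, not a real manipulator model.

```python
import numpy as np

# Squared path speed b(s) = s_dot^2 on a uniform grid of the path
# parameter s in [0, 1]. All constraints are linear in b.
N, ds = 51, 0.02
b_max, a_max = 4.0, 2.0          # speed-squared and acceleration limits
db = 2.0 * ds * a_max            # max change of b per grid step,
                                 # since b'(s)/2 is the acceleration

b = np.full(N, b_max)
b[0] = b[-1] = 0.0               # start and end at rest
for i in range(N - 1):           # forward sweep: acceleration limit
    b[i + 1] = min(b[i + 1], b[i] + db)
for i in range(N - 2, -1, -1):   # backward sweep: deceleration limit
    b[i] = min(b[i], b[i + 1] + db)

speed = np.sqrt(b)               # recovered path speed profile s_dot(s)
```

Because b, not s_dot, is the decision variable, the sweep never divides by the path speed, which is how the zero-speed singularity at the endpoints is avoided.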