AITopics

doi: 10.3233/FAIA251394

2507.19109

Country:

Europe (0.28)
North America > United States (0.28)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

arXiv.org Artificial Intelligence, Nov-3-2025

Heterogeneous Robot Collaboration in Unstructured Environments with Grounded Generative Intelligence

Ravichandran, Zachary, Cladera, Fernando, Prabhu, Ankit, Hughes, Jason, Murali, Varun, Taylor, Camillo, Pappas, George J., Kumar, Vijay

Heterogeneous robot teams operating in realistic settings often must accomplish complex missions requiring collaboration and adaptation to information acquired online. Because robot teams frequently operate in unstructured environments -- uncertain, open-world settings without prior maps -- subtasks must be grounded in robot capabilities and the physical world. While heterogeneous teams have typically been designed for fixed specifications, generative intelligence opens the possibility of teams that can accomplish a wide range of missions described in natural language. However, current large language model (LLM)-enabled teaming methods typically assume well-structured and known environments, limiting deployment in unstructured environments. We present SPINE-HT, a framework that addresses these limitations by grounding the reasoning abilities of LLMs in the context of a heterogeneous robot team through a three-stage process. Given language specifications describing mission goals and team capabilities, an LLM generates grounded subtasks which are validated for feasibility. Subtasks are then assigned to robots based on capabilities such as traversability or perception and refined given feedback collected during online operation. In simulation experiments with closed-loop perception and control, our framework achieves nearly twice the success rate compared to prior LLM-enabled heterogeneous teaming approaches. In real-world experiments with a Clearpath Jackal, a Clearpath Husky, a Boston Dynamics Spot, and a high-altitude UAV, our method achieves an 87% success rate in missions requiring reasoning about robot capabilities and refining subtasks with online feedback. More information is provided at https://zacravichandran.github.io/SPINE-HT.

artificial intelligence, large language model, natural language, (20 more...)

2510.26915

Country: North America > United States (0.28)

Genre: Research Report (0.64)

Industry:

Leisure & Entertainment > Games (0.75)
Energy (0.49)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.93)

arXiv.org Artificial Intelligence, Oct-31-2025

Hybrid DQN-TD3 Reinforcement Learning for Autonomous Navigation in Dynamic Environments

He, Xiaoyi, Chen, Danggui, Zhang, Zhenshuo, Bai, Zimeng

This paper presents a hierarchical path-planning and control framework that combines a high-level Deep Q-Network (DQN) for discrete sub-goal selection with a low-level Twin Delayed Deep Deterministic Policy Gradient (TD3) controller for continuous actuation. The high-level module selects behaviors and sub-goals; the low-level module executes smooth velocity commands. We design a practical reward shaping scheme (direction, distance, obstacle avoidance, action smoothness, collision penalty, time penalty, and progress), together with a LiDAR-based safety gate that prevents unsafe motions. The system is implemented in ROS + Gazebo (TurtleBot3) and evaluated with PathBench metrics, including success rate, collision rate, path efficiency, and re-planning efficiency, in dynamic and partially observable environments. Experiments show improved success rate and sample efficiency over single-algorithm baselines (DQN or TD3 alone) and rule-based planners, with better generalization to unseen obstacle configurations and reduced abrupt control changes. Code and evaluation scripts are available at the project repository.
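The abstract mentions a LiDAR-based safety gate but does not describe how it works. The following is a minimal sketch of one way such a gate could sit between the low-level controller and the robot. The function name `safety_gate`, the thresholds `stop_dist` and `slow_dist`, and the linear slow-down rule are all assumptions made for illustration, not details taken from the paper.

```python
import math
from typing import Sequence, Tuple


def safety_gate(ranges: Sequence[float], v: float, w: float,
                stop_dist: float = 0.25,
                slow_dist: float = 0.60) -> Tuple[float, float]:
    """Filter a (linear, angular) velocity command using LiDAR ranges.

    Hypothetical rule: stop translation when the nearest valid return is
    within stop_dist, scale forward speed linearly between stop_dist and
    slow_dist, and pass the command through otherwise.
    """
    # Discard invalid readings (NaN, inf, non-positive ranges).
    valid = [r for r in ranges if math.isfinite(r) and r > 0.0]
    if not valid:
        return v, w  # no usable returns: leave the command unchanged
    nearest = min(valid)
    if nearest <= stop_dist:
        return 0.0, w  # block translation, still allow turning in place
    if nearest < slow_dist and v > 0.0:
        scale = (nearest - stop_dist) / (slow_dist - stop_dist)
        return v * scale, w
    return v, w
```

This sketch uses the nearest return in any direction, which is conservative; a gate that only considers the sector along the direction of travel would be less restrictive.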

machine learning, reinforcement, reinforcement learning, (14 more...)

2510.26646

Country: North America > United States (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

arXiv.org Artificial Intelligence, Oct-31-2025

FLYINGTRUST: A Benchmark for Quadrotor Navigation Across Scenarios and Vehicles

Li, Gang, Zhai, Chunlei, Wang, Teng, Li, Shaun, Jiang, Shangsong, Zhu, Xiangwei

Visual navigation algorithms for quadrotors often exhibit a large variation in performance when transferred across different vehicle platforms and scene geometries, which increases the cost and risk of field deployment. To support systematic early-stage evaluation, we introduce FLYINGTRUST, a high-fidelity, configurable benchmarking framework that measures how platform kinodynamics and scenario structure jointly affect navigation robustness. The benchmark pairs a diverse scenario library with a heterogeneous set of real and virtual platforms and prescribes a standardized evaluation protocol together with a composite scoring method that balances scenario importance, platform importance and performance stability. We use FLYINGTRUST to compare representative optimization-based and learning-based navigation approaches under identical conditions, performing repeated trials per platform-scenario combination and reporting uncertainty-aware metrics. The results reveal systematic patterns: navigation success depends predictably on platform capability and scene geometry, and different algorithms exhibit distinct preferences and failure modes across the evaluated conditions. These observations highlight the practical necessity of incorporating both platform capability and scenario structure into algorithm design, evaluation, and selection, and they motivate future work on methods that remain robust across diverse platforms and scenarios.

Unmanned Aerial Vehicles (UAVs) are aircraft operated without onboard human pilots, either by remote control or by preprogrammed flight plans [1]. By independently modulating the speeds of four motor-propeller units, a quadrotor can generate collective thrust for vertical motion and differential thrust and reaction torques for attitude control. These capabilities enable six-degrees-of-freedom motion combined with fine low-speed control, which has driven extensive adoption of quadrotors in precision agriculture, infrastructure inspection, high-resolution mapping, environmental monitoring and disaster response [2]-[11]. Over the last decade, many high-performance visual navigation methods have been developed, ranging from classical optimization-based planners to recent learning-based approaches [12]-[15].

The benchmark of FLYINGTRUST is available at https://github.com/

[Figure caption: The blue line represents the straight-line reference path, and the red curve is an example of a collision-free trajectory executed by a planner.]

artificial intelligence, machine learning, planning & scheduling, (18 more...)

2510.26588

Country: Asia > China (0.28)

Genre: Research Report (1.00)

Industry:

Transportation > Air (1.00)
Aerospace & Defense (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.66)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.47)

Kapoor, Vansh, Nair, Jayakrishnan

MDPs with a State Sensing Cost

In many practical sequential decision-making problems, tracking the state of the environment incurs a sensing/communication/computation cost. In these settings, the agent's interaction with its environment includes the additional component of deciding when to sense the state, in a manner that balances the value associated with optimal (state-specific) actions and the cost of sensing. We formulate this as an expected discounted cost Markov Decision Process (MDP), wherein the agent incurs an additional cost for sensing its next state, but has the option to take actions while remaining 'blind' to the system state. We pose this problem as a classical discounted cost MDP with an expanded (countably infinite) state space. While computing the optimal policy for this MDP is intractable in general, we derive lower bounds on the optimal value function, which allow us to bound the suboptimality gap of any policy. We also propose a computationally efficient algorithm SPI, based on policy improvement, which in practice performs close to the optimal policy. Finally, we benchmark against the state-of-the-art via a numerical case study.
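To make the idea of acting while blind concrete, the sketch below propagates a belief over states through a sequence of unsensed actions and computes the expected immediate cost of the next action. It is not the SPI algorithm from the abstract. The interpretation that an expanded state can be summarised by the last sensed state plus the actions taken since is an assumption made for this example, and the names `blind_step` and `expected_cost` are hypothetical.

```python
from typing import List, Sequence


def blind_step(belief: Sequence[float],
               P_a: Sequence[Sequence[float]]) -> List[float]:
    """Push a belief through one action without sensing: b' = b @ P_a."""
    n = len(belief)
    return [sum(belief[i] * P_a[i][j] for i in range(n)) for j in range(n)]


def expected_cost(belief: Sequence[float], cost_a: Sequence[float]) -> float:
    """Expected immediate cost of an action under the current belief."""
    return sum(b * c for b, c in zip(belief, cost_a))


# Two-state example: the state was last sensed as state 0, then the same
# action is applied twice without sensing.
P = [[0.9, 0.1],
     [0.2, 0.8]]
belief = [1.0, 0.0]
belief = blind_step(belief, P)   # one blind step
belief = blind_step(belief, P)   # two blind steps
cost_if_act = expected_cost(belief, [0.0, 5.0])
```

Each additional blind step spreads the belief further from the last sensed state, which raises the expected cost of acting without information; the agent can weigh that against the cost of sensing.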

artificial intelligence, machine learning, reinforcement learning, (21 more...)

2505.0328

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.47)
Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.88)

Kokhahi, Ahmad, Kurz, Mary

Collision avoidance and path finding in a robotic mobile fulfillment system using multi-objective meta-heuristics

The rapid growth of e-commerce in recent years has significantly transformed people's shopping habits [1]. Consumers increasingly favor online shopping over in-person purchases, leading to a substantial impact on product logistics, which plays a crucial role in customer satisfaction. In addition to product quality and other factors, the timely delivery of orders has become a key determinant of customer satisfaction. Picking and replenishment tasks are responsible for 65% of operating costs [2]. In a conventional manual order picking system, often referred to as a picker-to-parts system, pickers dedicate 70% of their working time to searching for items and traveling within the facility [3, 4].

artificial intelligence, evolutionary algorithm, machine learning, (17 more...)

2510.2565

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.46)

Industry:

Transportation (0.83)
Energy (0.70)
Information Technology > Services > e-Commerce Services (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.65)

Learning to Plan & Schedule with Reinforcement-Learned Bimanual Robot Skills

Wan, Weikang, Ramos, Fabio, Yang, Xuning, Garrett, Caelan

Long-horizon contact-rich bimanual manipulation presents a significant challenge, requiring complex coordination involving a mixture of parallel execution and sequential collaboration between arms. In this paper, we introduce a hierarchical framework that frames this challenge as an integrated skill planning & scheduling problem, going beyond purely sequential decision-making to support simultaneous skill invocation. Our approach is built upon a library of single-arm and bimanual primitive skills, each trained using Reinforcement Learning (RL) in GPU-accelerated simulation. We then train a Transformer-based planner on a dataset of skill compositions to act as a high-level scheduler, simultaneously predicting the discrete schedule of skills as well as their continuous parameters. We demonstrate that our method achieves higher success rates on complex, contact-rich tasks than end-to-end RL approaches and produces more efficient, coordinated behaviors than traditional sequential-only planners.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

2510.25634

Country:

North America > United States (0.46)
North America > Canada (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)

Schmöcker, Robin, Dockhorn, Alexander, Rosenhahn, Bodo

Grouping Nodes With Known Value Differences: A Lossless UCT-based Abstraction Algorithm

A core challenge of Monte Carlo Tree Search (MCTS) is its sample efficiency, which can be improved by grouping state-action pairs and using their aggregate statistics instead of single-node statistics. On the Go Abstractions in Upper Confidence bounds applied to Trees (OGA-UCT) is the state-of-the-art MCTS abstraction algorithm for deterministic environments that builds its abstraction using the Abstractions of State-Action Pairs (ASAP) framework, which aims to detect states and state-action pairs with the same value under optimal play by analysing the search graph. ASAP, however, requires two state-action pairs to have the same immediate reward, which is a rigid condition that limits the number of abstractions that can be found and thereby the sample efficiency. In this paper, we break with the paradigm of grouping value-equivalent states or state-action pairs and instead group states and state-action pairs with possibly different values as long as the difference between their values can be inferred. We call this abstraction framework Known Value Difference Abstractions (KVDA), which infers the value differences by analysis of the immediate rewards and modifies OGA-UCT to use this framework instead. The modification is called KVDA-UCT, which detects significantly more abstractions than OGA-UCT, introduces no additional parameter, and outperforms OGA-UCT on a variety of deterministic environments and parameter settings.
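The sketch below illustrates how statistics might be pooled across nodes whose values differ by a known amount, and then used in a standard UCB1 selection score. It is an assumption-laden illustration, not the KVDA-UCT procedure: the way offsets are subtracted before pooling, and the names `group_mean`, `member_estimate` and `ucb1`, are hypothetical choices for this example.

```python
import math
from typing import List, Tuple

# Each member is (total_return, visits, offset), where offset is the known
# value difference between this member and a reference member of the group.
Member = Tuple[float, int, float]


def group_mean(members: List[Member]) -> float:
    """Pooled mean value of the group's reference member."""
    visits = sum(n for _, n, _ in members)
    if visits == 0:
        return 0.0
    # Remove each member's known offset so all returns share one baseline.
    shifted = sum(total - n * offset for total, n, offset in members)
    return shifted / visits


def member_estimate(members: List[Member], idx: int) -> float:
    """Value estimate for one member: pooled mean plus its own offset."""
    return group_mean(members) + members[idx][2]


def ucb1(mean: float, visits: int, parent_visits: int,
         c: float = math.sqrt(2)) -> float:
    """Standard UCB1 score; unvisited nodes are tried first."""
    if visits == 0:
        return math.inf
    return mean + c * math.sqrt(math.log(parent_visits) / visits)
```

With members `[(10.0, 5, 0.0), (18.0, 6, 1.0)]`, the pooled reference mean uses all 11 samples rather than 5 or 6, which is the sample-efficiency benefit the abstract attributes to grouping.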

abstraction, artificial intelligence, machine learning, (18 more...)

2510.25388

Country:

Europe (1.00)
North America > United States (0.93)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Collaborative Scheduling of Time-dependent UAVs, Vehicles and Workers for Crowdsensing in Disaster Response

Han, Lei, Zhang, Jinhao, Liu, Jinhui, Yu, Zhiyong, Wang, Liang, Wang, Quan, Yu, Zhiwen

Frequent natural disasters cause significant losses to human society, and timely, efficient collection of post-disaster environmental information is the foundation for effective rescue operations. Due to the extreme complexity of post-disaster environments, existing sensing technologies such as mobile crowdsensing suffer from weak environmental adaptability, insufficient professional sensing capabilities, and poor practicality of sensing solutions. Therefore, this paper explores a heterogeneous multi-agent online collaborative scheduling algorithm, HoCs-MPQ, to achieve efficient collection of post-disaster environmental information. HoCs-MPQ models collaboration and conflict relationships among multiple elements through weighted undirected graph construction, and iteratively solves the maximum weight independent set based on multi-priority queues, ultimately achieving collaborative sensing scheduling of time-dependent UAVs, vehicles, and workers. Specifically, (1) HoCs-MPQ constructs weighted undirected graph nodes based on collaborative relationships among multiple elements and quantifies their weights, then models the weighted undirected graph based on conflict relationships between nodes; (2) HoCs-MPQ solves the maximum weight independent set based on iterated local search, and accelerates the solution process using multi-priority queues. Finally, we conducted detailed experiments based on extensive real-world and simulated data. The experiments show that, compared to baseline methods (e.g., HoCs-GREEDY, HoCs-K-WTA, HoCs-MADL, and HoCs-MARL), HoCs-MPQ improves task completion rates by an average of 54.13%, 23.82%, 14.12%, and 12.89% respectively, with computation time for single online autonomous scheduling decisions not exceeding 3 seconds.
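For readers unfamiliar with the maximum weight independent set problem on a conflict graph, the sketch below shows a simple greedy heuristic driven by a single priority queue. It is a baseline illustration only and is not the iterated local search with multi-priority queues described in the abstract. The function name `greedy_mwis` and the example graph are hypothetical.

```python
import heapq
from typing import Dict, Hashable, Iterable, List, Tuple


def greedy_mwis(weights: Dict[Hashable, float],
                edges: Iterable[Tuple[Hashable, Hashable]]) -> List[Hashable]:
    """Greedy heuristic: repeatedly take the heaviest node that does not
    conflict with any node already chosen. Not guaranteed to be optimal."""
    adj = {v: set() for v in weights}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # heapq is a min-heap, so store negated weights to pop the heaviest first.
    heap = [(-w, v) for v, w in weights.items()]
    heapq.heapify(heap)
    chosen, blocked = [], set()
    while heap:
        _, v = heapq.heappop(heap)
        if v in blocked:
            continue
        chosen.append(v)
        blocked.add(v)
        blocked |= adj[v]  # neighbours conflict with v and are excluded
    return chosen


# Nodes stand for candidate assignments; an edge marks two assignments
# that cannot both be scheduled.
weights = {"a": 5.0, "b": 4.0, "c": 3.0, "d": 1.0}
edges = [("a", "b"), ("b", "c"), ("c", "d")]
selected = greedy_mwis(weights, edges)
```

On this path graph the heuristic selects `a` and `c` for a total weight of 8. Node identifiers should be mutually comparable so that ties in weight can be broken by the heap.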

machine learning, reinforcement learning, uav, (21 more...)

2510.25212

Country: Asia > China (0.94)

Genre: Research Report (1.00)

Industry:

Transportation > Ground > Road (0.93)
Information Technology (0.93)
Energy (0.92)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
(4 more...)

Andrei, Iancu, Kloetzer, Marius, Mahulea, Cristian, Dosoftei, Catalin

Smooth path planning with safety margins using Piece-Wise Bezier curves

In this paper, we propose a computationally efficient quadratic programming (QP) approach for generating smooth, $C^1$ continuous paths for mobile robots using piece-wise quadratic Bezier (PWB) curves. Our method explicitly incorporates safety margins within a structured optimization framework, balancing trajectory smoothness and robustness with manageable numerical complexity suitable for real-time and embedded applications. Comparative simulations demonstrate clear advantages over traditional piece-wise linear (PWL) path planning methods, showing reduced trajectory deviations, enhanced robustness, and improved overall path quality. These benefits are validated through simulations using a Pure-Pursuit controller in representative scenarios, highlighting the practical effectiveness and scalability of our approach for safe navigation.
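The $C^1$ condition for consecutive quadratic Bezier segments can be checked numerically. The sketch below evaluates a quadratic Bezier curve and its derivative and computes the control point that makes the next segment $C^1$ continuous with the previous one, assuming both segments are parameterised over $t \in [0, 1]$. It illustrates the continuity condition only and is not the QP formulation from the abstract; the function names are hypothetical.

```python
from typing import Tuple

Point = Tuple[float, float]


def bezier2(p0: Point, p1: Point, p2: Point, t: float) -> Point:
    """Point on a quadratic Bezier curve at parameter t in [0, 1]."""
    a, b, c = (1 - t) ** 2, 2 * (1 - t) * t, t ** 2
    return (a * p0[0] + b * p1[0] + c * p2[0],
            a * p0[1] + b * p1[1] + c * p2[1])


def bezier2_derivative(p0: Point, p1: Point, p2: Point, t: float) -> Point:
    """First derivative: B'(t) = 2(1 - t)(P1 - P0) + 2t(P2 - P1)."""
    return (2 * (1 - t) * (p1[0] - p0[0]) + 2 * t * (p2[0] - p1[0]),
            2 * (1 - t) * (p1[1] - p0[1]) + 2 * t * (p2[1] - p1[1]))


def c1_next_control(p1: Point, p2: Point) -> Point:
    """Middle control point Q1 of the next segment (with Q0 = P2) such that
    the tangent is continuous at the joint: Q1 - Q0 = P2 - P1."""
    return (2 * p2[0] - p1[0], 2 * p2[1] - p1[1])


# First segment, then a second segment that starts where the first ends.
p0, p1, p2 = (0.0, 0.0), (1.0, 0.0), (2.0, 1.0)
q0 = p2
q1 = c1_next_control(p1, p2)
q2 = (4.0, 2.0)  # the end point of the next segment is free to choose
```

Because the middle control point of each following segment is fixed by the joint condition, only the segment end points and the first middle control point remain free, which keeps the number of decision variables small.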

artificial intelligence, optimization problem, planning & scheduling, (19 more...)

doi: 10.1109/ETFA65518.2025.11205537

2510.24972

Country:

Europe > Spain (0.14)
Europe > Romania (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)