Planning & Scheduling
A User Study on Contrastive Explanations for Multi-Effector Temporal Planning with Non-Stationary Costs
Liu, Xiaowei, McAreavey, Kevin, Liu, Weiru
In this paper, we adopt constrastive explanations within an end-user application for temporal planning of smart homes. In this application, users have requirements on the execution of appliance tasks, pay for energy according to dynamic energy tariffs, have access to high-capacity battery storage, and are able to sell energy to the grid. The concurrent scheduling of devices makes this a multi-effector planning problem, while the dynamic tariffs yield costs that are non-stationary (alternatively, costs that are stationary but depend on exogenous events). These characteristics are such that the planning problems are generally not supported by existing PDDL-based planners, so we instead design a custom domain-dependent planner that scales to reasonable appliance numbers and time horizons. We conduct a controlled user study with 128 participants using an online crowd-sourcing platform based on two user stories. Our results indicate that users provided with contrastive questions and explanations have higher levels of satisfaction, tend to gain improved understanding, and rate the helpfulness more favourably with the recommended AI schedule compared to those without access to these features.
Velocity Field: An Informative Traveling Cost Representation for Trajectory Planning
Xin, Ren, Cheng, Jie, Wang, Sheng, Liu, Ming
Trajectory planning involves generating a series of space points to be followed in the near future. However, due to the complex and uncertain nature of the driving environment, it is impractical for autonomous vehicles~(AVs) to exhaustively design planning rules for optimizing future trajectories. To address this issue, we propose a local map representation method called Velocity Field. This approach provides heading and velocity priors for trajectory planning tasks, simplifying the planning process in complex urban driving. The heading and velocity priors can be learned from demonstrations of human drivers using our proposed loss. Additionally, we developed an iterative sampling-based planner to train and compare the differences between local map representations. We investigated local map representation forms for planning performance on a real-world dataset. Compared to learned rasterized cost maps, our method demonstrated greater reliability and computational efficiency.
From Cognition to Precognition: A Future-Aware Framework for Social Navigation
Gong, Zeying, Hu, Tianshuai, Qiu, Ronghe, Liang, Junwei
To navigate safely and efficiently in crowded spaces, robots should not only perceive the current state of the environment but also anticipate future human movements. In this paper, we propose a reinforcement learning architecture, namely Falcon, to tackle socially-aware navigation by explicitly predicting human trajectories and penalizing actions that block future human paths. To facilitate realistic evaluation, we introduce a novel SocialNav benchmark containing two new datasets, Social-HM3D and Social-MP3D. This benchmark offers large-scale photo-realistic indoor scenes populated with a reasonable amount of human agents based on scene area size, incorporating natural human movements and trajectory patterns. We conduct a detailed experimental analysis with the state-of-the-art learning-based method and two classic rule-based path-planning algorithms on the new benchmark. The results demonstrate the importance of future prediction and our method achieves the best task success rate of 55% while maintaining about 90% personal space compliance. We will release our code and datasets. Videos of demonstrations can be viewed at https://zeying-gong.github.io/projects/falcon/ .
Vision Language Models Can Parse Floor Plan Maps
DeFazio, David, Mehta, Hrudayangam, Blackburn, Jeremy, Zhang, Shiqi
Vision language models (VLMs) can simultaneously reason about images and texts to tackle many tasks, from visual question answering to image captioning. This paper focuses on map parsing, a novel task that is unexplored within the VLM context and particularly useful to mobile robots. Map parsing requires understanding not only the labels but also the geometric configurations of a map, i.e., what areas are like and how they are connected. To evaluate the performance of VLMs on map parsing, we prompt VLMs with floorplan maps to generate task plans for complex indoor navigation. Our results demonstrate the remarkable capability of VLMs in map parsing, with a success rate of 0.96 in tasks requiring a sequence of nine navigation actions, e.g., approaching and going through doors. Other than intuitive observations, e.g., VLMs do better in smaller maps and simpler navigation tasks, there was a very interesting observation that its performance drops in large open areas. We provide practical suggestions to address such challenges as validated by our experimental results. Webpage: https://shorturl.at/OUkEY
A Value Based Parallel Update MCTS Method for Multi-Agent Cooperative Decision Making of Connected and Automated Vehicles
Han, Ye, Zhang, Lijun, Meng, Dejian, Hu, Xingyu, Weng, Songyu
To solve the problem of lateral and logitudinal joint decision-making of multi-vehicle cooperative driving for connected and automated vehicles (CAVs), this paper proposes a Monte Carlo tree search (MCTS) method with parallel update for multi-agent Markov game with limited horizon and time discounted setting. By analyzing the parallel actions in the multi-vehicle joint action space in the partial-steady-state traffic flow, the parallel update method can quickly exclude potential dangerous actions, thereby increasing the search depth without sacrificing the search breadth. The proposed method is tested in a large number of randomly generated traffic flow. The experiment results show that the algorithm has good robustness and better performance than the SOTA reinforcement learning algorithms and heuristic methods. The vehicle driving strategy using the proposed algorithm shows rationality beyond human drivers, and has advantages in traffic efficiency and safety in the coordinating zone.
Autonomous Navigation in Ice-Covered Waters with Learned Predictions on Ship-Ice Interactions
Zhong, Ninghan, Potenza, Alessandro, Smith, Stephen L.
Autonomous navigation in ice-covered waters poses significant challenges due to the frequent lack of viable collision-free trajectories. When complete obstacle avoidance is infeasible, it becomes imperative for the navigation strategy to minimize collisions. Additionally, the dynamic nature of ice, which moves in response to ship maneuvers, complicates the path planning process. To address these challenges, we propose a novel deep learning model to estimate the coarse dynamics of ice movements triggered by ship actions through occupancy estimation. To ensure real-time applicability, we propose a novel approach that caches intermediate prediction results and seamlessly integrates the predictive model into a graph search planner. We evaluate the proposed planner both in simulation and in a physical testbed against existing approaches and show that our planner significantly reduces collisions with ice when compared to the state-of-the-art. Codes and demos of this work are available at https://github.com/IvanIZ/predictive-asv-planner.
Mission Planning on Autonomous Avoidance for Spacecraft Confronting Orbital Debris
Xingwen, Chen, Tong, Wang, Jianbin, Qiu, Jianbo, Feng
This paper investigates the mission planning problem for spacecraft confronting orbital debris to achieve autonomous avoidance. Firstly, combined with the avoidance requirements, a closed-loop framework of autonomous avoidance for orbital debris is proposed. Under the established model of mission planning, a two-stage planning is proposed to coordinate the conflict between routine tasks and debris avoidance. During the planning for expansion, the temporal constraints for duration actions are handled by the ordering choices. Meanwhile, dynamic resource variables satisfying instantaneous numerical change and continuous linear change are reasoned in the execution of actions. Linear Programming (LP) can solve the bounds of variables in each state, which is used to check the consistency of the interactive constraints on duration and resource. Then, the temporal relaxed planning graph (TRPG) heuristics is rationally developed to guide the plan towards the goal. Finally, the simulation demonstrates that the proposed mission planning strategy can effectively achieve the autonomous debris avoidance of the spacecraft.
Synthesizing Evolving Symbolic Representations for Autonomous Systems
Sartor, Gabriele, Oddi, Angelo, Rasconi, Riccardo, Santucci, Vieri Giuliano, Meo, Rosa
Recently, AI systems have made remarkable progress in various tasks. Deep Reinforcement Learning(DRL) is an effective tool for agents to learn policies in low-level state spaces to solve highly complex tasks. Researchers have introduced Intrinsic Motivation(IM) to the RL mechanism, which simulates the agent's curiosity, encouraging agents to explore interesting areas of the environment. This new feature has proved vital in enabling agents to learn policies without being given specific goals. However, even though DRL intelligence emerges through a sub-symbolic model, there is still a need for a sort of abstraction to understand the knowledge collected by the agent. To this end, the classical planning formalism has been used in recent research to explicitly represent the knowledge an autonomous agent acquires and effectively reach extrinsic goals. Despite classical planning usually presents limited expressive capabilities, PPDDL demonstrated usefulness in reviewing the knowledge gathered by an autonomous system, making explicit causal correlations, and can be exploited to find a plan to reach any state the agent faces during its experience. This work presents a new architecture implementing an open-ended learning system able to synthesize from scratch its experience into a PPDDL representation and update it over time. Without a predefined set of goals and tasks, the system integrates intrinsic motivations to explore the environment in a self-directed way, exploiting the high-level knowledge acquired during its experience. The system explores the environment and iteratively: (a) discover options, (b) explore the environment using options, (c) abstract the knowledge collected and (d) plan. This paper proposes an alternative approach to implementing open-ended learning architectures exploiting low-level and high-level representations to extend its knowledge in a virtuous loop.
LMMCoDrive: Cooperative Driving with Large Multimodal Model
Liu, Haichao, Yao, Ruoyu, Huang, Zhenmin, Shen, Shaojie, Ma, Jun
To address the intricate challenges of decentralized cooperative scheduling and motion planning in Autonomous Mobility-on-Demand (AMoD) systems, this paper introduces LMMCoDrive, a novel cooperative driving framework that leverages a Large Multimodal Model (LMM) to enhance traffic efficiency in dynamic urban environments. This framework seamlessly integrates scheduling and motion planning processes to ensure the effective operation of Cooperative Autonomous Vehicles (CAVs). The spatial relationship between CAVs and passenger requests is abstracted into a Bird's-Eye View (BEV) to fully exploit the potential of the LMM. Besides, trajectories are cautiously refined for each CAV while ensuring collision avoidance through safety constraints. A decentralized optimization strategy, facilitated by the Alternating Direction Method of Multipliers (ADMM) within the LMM framework, is proposed to drive the graph evolution of CAVs. Simulation results demonstrate the pivotal role and significant impact of LMM in optimizing CAV scheduling and enhancing decentralized cooperative optimization process for each vehicle. This marks a substantial stride towards achieving practical, efficient, and safe AMoD systems that are poised to revolutionize urban transportation. The code is available at https://github.com/henryhcliu/LMMCoDrive.
Towards Explainable Goal Recognition Using Weight of Evidence (WoE): A Human-Centered Approach
Alshehri, Abeer, Abdulrahman, Amal, Alamri, Hajar, Miller, Tim, Vered, Mor
Goal recognition (GR) involves inferring an agent's unobserved goal from a sequence of observations. This is a critical problem in AI with diverse applications. Traditionally, GR has been addressed using 'inference to the best explanation' or abduction, where hypotheses about the agent's goals are generated as the most plausible explanations for observed behavior. Alternatively, some approaches enhance interpretability by ensuring that an agent's behavior aligns with an observer's expectations or by making the reasoning behind decisions more transparent. In this work, we tackle a different challenge: explaining the GR process in a way that is comprehensible to humans. We introduce and evaluate an explainable model for goal recognition (GR) agents, grounded in the theoretical framework and cognitive processes underlying human behavior explanation. Drawing on insights from two human-agent studies, we propose a conceptual framework for human-centered explanations of GR. Using this framework, we develop the eXplainable Goal Recognition (XGR) model, which generates explanations for both why and why not questions. We evaluate the model computationally across eight GR benchmarks and through three user studies. The first study assesses the efficiency of generating human-like explanations within the Sokoban game domain, the second examines perceived explainability in the same domain, and the third evaluates the model's effectiveness in aiding decision-making in illegal fishing detection. Results demonstrate that the XGR model significantly enhances user understanding, trust, and decision-making compared to baseline models, underscoring its potential to improve human-agent collaboration.