 Planning & Scheduling


Accelerating Exact Combinatorial Optimization via RL-based Initialization -- A Case Study in Scheduling

arXiv.org Artificial Intelligence

Scheduling on dataflow graphs (also known as computation graphs) is an NP-hard problem. Traditional exact methods are limited by runtime complexity, while reinforcement learning (RL) and heuristic-based approaches struggle with determinism and solution quality. This research aims to develop an innovative approach that employs machine learning (ML) to address combinatorial optimization problems, using scheduling as a case study. The goal is to provide guarantees of optimality and determinism while maintaining the runtime cost of heuristic methods. Specifically, we introduce a novel two-phase RL-to-ILP scheduling framework consisting of three steps: 1) an RL solver acts as a coarse-grain scheduler, 2) solution relaxation, and 3) exact solving via integer linear programming (ILP). Our framework matches the scheduling performance of exact scheduling methods while achieving up to 128× speedup. The evaluation was conducted on actual EdgeTPU platforms, using ImageNet DNN computation graphs as input. Additionally, the framework offers improved on-chip inference runtime and acceleration compared to the commercially available EdgeTPU compiler.
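
The three steps can be illustrated with a toy sketch. Everything here is an assumption made for illustration and not the paper's formulation: the objective (minimize the peak per-stage cost), the `relax` window rule, the hard-coded `coarse` assignment standing in for an RL policy, and the exhaustive search standing in for an ILP solver. The result is only optimal within the relaxed windows.

```python
from itertools import product

def relax(initial, num_stages, slack=1):
    """Turn a coarse stage assignment into per-node windows of candidate stages."""
    return {
        n: range(max(0, s - slack), min(num_stages - 1, s + slack) + 1)
        for n, s in initial.items()
    }

def exact_solve(edges, cost, windows, num_stages):
    """Exhaustively search the windows (a stand-in for an ILP solver)."""
    nodes = list(windows)
    best, best_peak = None, float("inf")
    for stages in product(*(windows[n] for n in nodes)):
        assign = dict(zip(nodes, stages))
        # Dependency constraint: a producer may not sit in a later stage than its consumer.
        if any(assign[u] > assign[v] for u, v in edges):
            continue
        load = [0] * num_stages
        for n, s in assign.items():
            load[s] += cost[n]
        peak = max(load)
        if peak < best_peak:
            best, best_peak = assign, peak
    return best, best_peak

# Hypothetical 4-node chain graph with per-node costs.
edges = [("a", "b"), ("b", "c"), ("c", "d")]
cost = {"a": 4, "b": 1, "c": 1, "d": 4}
coarse = {"a": 0, "b": 0, "c": 0, "d": 1}  # pretend this came from an RL policy
windows = relax(coarse, num_stages=2)
schedule, peak = exact_solve(edges, cost, windows, num_stages=2)
```

In this toy instance the coarse assignment has a peak stage cost of 6, and the windowed exact search improves it to 5 by moving node `c` to the second stage.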


Clothoid Curve-based Emergency-Stopping Path Planning with Adaptive Potential Field for Autonomous Vehicles

arXiv.org Artificial Intelligence

The Potential Field (PF)-based path planning method is widely adopted for autonomous vehicles (AVs) due to its real-time efficiency and simplicity. PF often creates a rigid road boundary, and while this ensures that the ego vehicle consistently operates within the confines of the road, it also brings a lurking peril in emergency scenarios. If nearby vehicles suddenly switch lanes, the AV has to veer off and brake to evade a collision, leading to the "blind alley" effect. In such a situation, the vehicle can become trapped or confused by the conflicting forces from the obstacle vehicle PF and road boundary PF, often resulting in indecision, erratic behavior, or even crashes. To address these challenges, this research introduces an Emergency-Stopping Path Planning (ESPP) method that incorporates an adaptive PF (APF) and a clothoid curve for urgent evasion. First, we design an emergency triggering estimation to detect the "blind alley" problem by analyzing the PF distribution. Second, we regionalize the driving scene to search for the optimal breach point on the road PF and the final stopping point for the vehicle by considering the possible motion range of the obstacle. Finally, we use the optimized clothoid curve to fit these calculated points under vehicle dynamics constraints to generate a smooth emergency avoidance path. The proposed ESPP-based APF method was evaluated in a co-simulation between MATLAB/Simulink and the CarSim simulator in a freeway scene. The simulation results show that the proposed method improves emergency collision avoidance and makes the vehicle safer: the duration of wheel slip is 61.9% shorter and the maximum steering angle amplitude is 76.9% lower than with other potential field-based methods.
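
For readers unfamiliar with clothoids: a clothoid is a curve whose curvature changes linearly with arc length, which is why it yields smooth steering profiles. The sketch below numerically integrates such a curve. The function name `clothoid_points`, its parameters, and the midpoint-rule integration are illustrative assumptions; the paper's optimization and fitting to the breach and stopping points are not reproduced.

```python
import math

def clothoid_points(x0, y0, theta0, kappa0, sharpness, length, steps=200):
    """Integrate a clothoid with the midpoint rule.

    Curvature is kappa(s) = kappa0 + sharpness * s, so the heading is
    theta(s) = theta0 + kappa0 * s + 0.5 * sharpness * s**2.
    """
    ds = length / steps
    x, y = x0, y0
    pts = [(x, y)]
    for i in range(steps):
        s_mid = (i + 0.5) * ds  # arc length at the middle of this step
        theta = theta0 + kappa0 * s_mid + 0.5 * sharpness * s_mid ** 2
        x += math.cos(theta) * ds
        y += math.sin(theta) * ds
        pts.append((x, y))
    return pts

# Zero curvature and zero sharpness degenerate to a straight line.
straight = clothoid_points(0.0, 0.0, 0.0, 0.0, 0.0, 10.0)
# Positive sharpness bends the path progressively to the left.
curved = clothoid_points(0.0, 0.0, 0.0, 0.0, 0.01, 20.0)
```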


Fast Decision Support for Air Traffic Management at Urban Air Mobility Vertiports using Graph Learning

arXiv.org Artificial Intelligence

Urban Air Mobility (UAM) promises a new dimension to decongested, safe, and fast travel in urban and suburban hubs. These UAM aircraft are conceived to operate from small airports called vertiports, each comprising multiple take-off/landing and battery-recharging spots. Since they might be situated in dense urban areas and need to handle many aircraft landings and take-offs each hour, managing this schedule in real time becomes challenging for a traditional air-traffic controller and instead calls for an automated solution. This paper provides a novel approach to this problem of Urban Air Mobility - Vertiport Schedule Management (UAM-VSM), which leverages graph reinforcement learning to generate decision-support policies. Here the designated physical spots within the vertiport's airspace and the vehicles being managed are represented as two separate graphs, with feature extraction performed through a graph convolutional network (GCN). Extracted features are passed on to perceptron layers to decide actions such as continue to hover or cruise, continue idling or take off, or land on an allocated vertiport spot. Performance is measured based on delays, safety (number of collisions), and battery consumption. Through realistic simulations in AirSim applied to scaled-down multi-rotor vehicles, our results demonstrate the suitability of using graph reinforcement learning to solve the UAM-VSM problem and its superiority to basic reinforcement learning (with graph embeddings) or random choice baselines.
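
As a rough illustration of the feature-extraction step, the sketch below implements one graph-convolution layer in plain Python. It assumes the common symmetric-normalization form, ReLU(D^{-1/2} (A + I) D^{-1/2} H W); the abstract does not state which GCN variant is used, and the three-spot graph, features, and weights are made up for the example.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(adj, feats, weight):
    """One graph-convolution layer with self-loops, symmetric normalization, and ReLU."""
    n = len(adj)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    out = matmul(matmul(norm, feats), weight)
    return [[max(0.0, v) for v in row] for row in out]

# Hypothetical graph of three vertiport spots connected in a line: 0 - 1 - 2.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
feats = [[1.0], [0.0], [0.0]]  # e.g. only spot 0 is occupied
weight = [[1.0]]               # identity weight, for readability
out = gcn_layer(adj, feats, weight)
```

After one layer, the feature of spot 0 has spread to its neighbor spot 1 but not yet to spot 2, which is two hops away.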


Reactive Motion Generation on Learned Riemannian Manifolds

arXiv.org Artificial Intelligence

In recent decades, advancements in motion learning have enabled robots to acquire new skills and adapt to unseen conditions in both structured and unstructured environments. In practice, motion learning methods capture relevant patterns and adjust them to new conditions such as dynamic obstacle avoidance or variable targets. In this paper, we investigate the robot motion learning paradigm from a Riemannian manifold perspective. We argue that Riemannian manifolds may be learned via human demonstrations in which geodesics are natural motion skills. The geodesics are generated using a learned Riemannian metric produced by our novel variational autoencoder (VAE), which is especially intended to recover full-pose end-effector states and joint space configurations. In addition, we propose a technique for facilitating on-the-fly end-effector/multiple-limb obstacle avoidance by reshaping the learned manifold using an obstacle-aware ambient metric. The motion generated using these geodesics may naturally result in multiple-solution tasks that have not been explicitly demonstrated previously. We extensively tested our approach in task space and joint space scenarios using a 7-DoF robotic manipulator. We demonstrate that our method is capable of learning and generating motion skills based on complicated motion patterns demonstrated by a human operator. Additionally, we assess several obstacle avoidance strategies and generate trajectories in multiple-mode settings.


Modelling the Spread of COVID-19 in Indoor Spaces using Automated Probabilistic Planning

arXiv.org Artificial Intelligence

The coronavirus disease 2019 (COVID-19) pandemic has been ongoing for around 3 years, and has infected over 750 million people and caused over 6 million deaths worldwide at the time of writing. Throughout the pandemic, several strategies for controlling the spread of the disease have been debated by healthcare professionals, government authorities, and international bodies. To anticipate the potential impact of the disease, and to simulate the effectiveness of different mitigation strategies, a robust model of disease spread is needed. In this work, we explore a novel approach based on probabilistic planning and dynamic graph analysis to model the spread of COVID-19 in indoor spaces. We endow the planner with means to control the spread of the disease through non-pharmaceutical interventions (NPIs) such as mandating masks and vaccines, and we compare the impact of crowds and capacity limits on the spread of COVID-19 in these settings. We demonstrate that the use of probabilistic planning is effective in predicting the number of infections that are likely to occur in shared spaces, and that automated planners have the potential to design competent interventions to limit the spread of the disease.


Planning to Learn: A Novel Algorithm for Active Learning during Model-Based Planning

arXiv.org Artificial Intelligence

Active Inference is a recent framework for modeling planning under uncertainty. Empirical and theoretical work has now begun to evaluate the strengths and weaknesses of this approach and how it might be improved. A recent extension - the sophisticated inference (SI) algorithm - improves performance on multi-step planning problems through recursive decision tree search. However, little work to date has been done to compare SI to other established planning algorithms. SI was also developed with a focus on inference as opposed to learning. The present paper has two aims. First, we compare the performance of SI to Bayesian reinforcement learning (RL) schemes designed to solve similar problems. Second, we present an extension of SI - sophisticated learning (SL) - that more fully incorporates active learning during planning. SL maintains beliefs about how model parameters would change under the future observations expected under each policy. This allows a form of counterfactual retrospective inference in which the agent considers what could be learned from current or past observations given different future observations. To accomplish these aims, we make use of a novel, biologically inspired environment designed to highlight the problem structure for which SL offers a unique solution. Here, an agent must continually search for available (but changing) resources in the presence of competing affordances for information gain. Our simulations show that SL outperforms all other algorithms in this context - most notably, Bayes-adaptive RL and upper confidence bound algorithms, which aim to solve multi-step planning problems using similar principles (i.e., directed exploration and counterfactual reasoning). These results provide added support for the utility of Active Inference in solving this class of biologically relevant problems and offer added tools for testing hypotheses about human cognition.


Neural-Network-Driven Method for Optimal Path Planning via High-Accuracy Region Prediction

arXiv.org Artificial Intelligence

Sampling-based path planning algorithms suffer from heavy reliance on uniform sampling, which leads to unreliable and time-consuming performance, especially in complex environments. Recently, neural-network-driven methods have been used to predict regions as sampling domains, enabling non-uniform sampling and reducing calculation time. However, the accuracy of region prediction hinders further improvement. We propose a sampling-based algorithm, named Region Prediction Neural Network RRT* (RPNN-RRT*), to rapidly obtain the optimal path based on high-accuracy region prediction. First, we implement a region prediction neural network (RPNN) to predict accurate regions for the RPNN-RRT*. A full-layer channel-wise attention module is employed to enhance the feature fusion in the concatenation between the encoder and decoder. Moreover, a three-level hierarchy loss is designed to learn the pixel-wise, map-wise, and patch-wise features. A dataset, named Complex Environment Motion Planning, is established to test the performance in complex environments. Ablation studies and test results show that the RPNN achieves a high region-prediction accuracy of 89.13% compared with other region prediction models. In addition, the RPNN-RRT* is evaluated in different complex scenarios, where it demonstrates significant and reliable superiority in terms of calculation time, sampling efficiency, and success rate for optimal path planning.
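
The idea of using a predicted region as the sampling domain can be sketched as a biased sampler. This is a generic illustration, not the RPNN-RRT* sampler: the function `sample_point`, the `bias` parameter, the unit-cell grid representation of the region, and the uniform fallback are all assumptions.

```python
import random

def sample_point(region_cells, width, height, bias=0.9, rng=random):
    """Sample from the predicted region with probability `bias`, else uniformly.

    `region_cells` is a list of (col, row) unit grid cells that a region
    predictor has marked as promising (hypothetical representation).
    """
    if region_cells and rng.random() < bias:
        cx, cy = rng.choice(region_cells)
        # Jitter uniformly inside the chosen unit cell.
        return (cx + rng.random(), cy + rng.random())
    # Fallback: uniform sampling over the whole map.
    return (rng.random() * width, rng.random() * height)

rng = random.Random(0)
in_region = [sample_point([(2, 3)], 10, 10, bias=1.0, rng=rng) for _ in range(100)]
uniform = [sample_point([(2, 3)], 10, 10, bias=0.0, rng=rng) for _ in range(100)]
```

Keeping a nonzero share of uniform samples is a common design choice in biased samplers, since it lets the planner still explore the map when the predicted region is wrong.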


Formally-Sharp DAgger for MCTS: Lower-Latency Monte Carlo Tree Search using Data Aggregation with Formal Methods

arXiv.org Artificial Intelligence

We study how to efficiently combine formal methods, Monte Carlo Tree Search (MCTS), and deep learning in order to produce high-quality receding horizon policies in large Markov decision processes (MDPs). In particular, we use model-checking techniques to guide the MCTS algorithm in order to generate offline samples of high-quality decisions on a representative set of states of the MDP. Those samples can then be used to train a neural network that imitates the policy used to generate them. This neural network can either be used as a guide in a lower-latency online MCTS search, or alternatively be used as a full-fledged policy when minimal latency is required. We use statistical model checking to detect when additional samples are needed and to focus those additional samples on configurations where the learnt neural network policy differs from the (computationally expensive) offline policy. We illustrate the use of our method on MDPs that model the Frozen Lake and Pac-Man environments -- two popular benchmarks for evaluating reinforcement-learning algorithms.


Formal Modelling for Multi-Robot Systems Under Uncertainty

arXiv.org Artificial Intelligence

Purpose of Review: To effectively synthesise and analyse multi-robot behaviour, we require formal task-level models which accurately capture multi-robot execution. In this paper, we review modelling formalisms for multi-robot systems under uncertainty, and discuss how they can be used for planning, reinforcement learning, model checking, and simulation. Recent Findings: Recent work has investigated models which more accurately capture multi-robot execution by considering different forms of uncertainty, such as temporal uncertainty and partial observability, and modelling the effects of robot interactions on action execution. Other strands of work have presented approaches for reducing the size of multi-robot models to admit more efficient solution methods. This can be achieved by decoupling the robots under independence assumptions, or reasoning over higher-level macro actions. Summary: Existing multi-robot models demonstrate a trade-off between accurately capturing robot dependencies and uncertainty, and being small enough to tractably solve real-world problems. Therefore, future research should exploit realistic assumptions over multi-robot behaviour to develop smaller models which retain accurate representations of uncertainty and robot interactions; and exploit the structure of multi-robot problems, such as factored state spaces, to develop scalable solution methods.


Graph-based View Motion Planning for Fruit Detection

arXiv.org Artificial Intelligence

Crop monitoring is crucial for maximizing agricultural productivity and efficiency. However, monitoring large and complex structures such as sweet pepper plants presents significant challenges, especially due to frequent occlusions of the fruits. Traditional next-best view planning can lead to unstructured and inefficient coverage of the crops. To address this, we propose a novel view motion planner that builds a graph network of viable view poses and trajectories between nearby poses, thereby considering robot motion constraints. The planner searches the graphs for view sequences with the highest accumulated information gain, allowing for efficient pepper plant monitoring while minimizing occlusions. The generated view poses aim both at sufficiently covering already detected fruits and at discovering new ones. The graph and the corresponding best view pose sequence are computed with a limited horizon and are adaptively updated at fixed time intervals as the system gathers new information. We demonstrate the effectiveness of our approach through simulated and real-world experiments using a robotic arm equipped with an RGB-D camera and mounted on a trolley. As the experimental results show, our planner produces view pose sequences that systematically cover the crops and leads to increased fruit coverage within a limited time in comparison to a state-of-the-art single next-best view planner.
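
A limited-horizon search for the view sequence with the highest accumulated information gain can be sketched as a depth-first enumeration over a pose graph. The graph, the per-view gain values, and the function `best_view_sequence` are hypothetical. Treating gains as fixed and additive is a simplification, and the paper's trajectory costs and motion constraints are not modeled.

```python
def best_view_sequence(graph, gain, start, horizon):
    """Return the path of at most `horizon` moves from `start` with maximal summed gain."""
    best_path, best_gain = [start], gain[start]

    def extend(path, total):
        nonlocal best_path, best_gain
        if total > best_gain:
            best_path, best_gain = list(path), total
        if len(path) - 1 == horizon:  # horizon counts moves, not poses
            return
        for nxt in graph[path[-1]]:
            if nxt in path:  # each view pose contributes its gain only once
                continue
            path.append(nxt)
            extend(path, total + gain[nxt])
            path.pop()

    extend([start], gain[start])
    return best_path, best_gain

# Hypothetical graph of four view poses; edges connect nearby poses.
graph = {"v0": ["v1", "v2"], "v1": ["v0", "v3"], "v2": ["v0", "v3"], "v3": ["v1", "v2"]}
gain = {"v0": 0.0, "v1": 1.0, "v2": 3.0, "v3": 2.0}
path, total = best_view_sequence(graph, gain, "v0", horizon=2)
```

Re-running such a search at fixed intervals with refreshed gain estimates would mirror the adaptive updating described in the abstract.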