Planning & Scheduling
CBAGAN-RRT: Convolutional Block Attention Generative Adversarial Network for Sampling-Based Path Planning
Sagar, Abhinav, Gilukara, Sai Teja
Sampling-based path planning algorithms play an important role in autonomous robotics. However, a common problem among the RRT-based algorithms is that the initial path generated is not optimal and the convergence is too slow to be used in real-world applications. In this paper, we propose a novel image-based learning algorithm (CBAGAN-RRT) using a Convolutional Block Attention Generative Adversarial Network with a combination of spatial and channel attention and a novel loss function to design the heuristics, find a better optimal path, and improve the convergence of the algorithm both concerning time and speed. The probability distribution of the paths generated from our GAN model is used to guide the sampling process for the RRT algorithm. We train and test our network on the dataset generated by \cite{zhang2021generative} and demonstrate that our algorithm outperforms the previous state-of-the-art algorithms using both the image quality generation metrics like IOU Score, Dice Score, FID score, and path planning metrics like time cost and the number of nodes. We conduct detailed experiments and ablation studies to illustrate the feasibility of our study and show that our model performs well not only on the training dataset but also on the unseen test dataset. The advantage of our approach is that we can avoid the complicated preprocessing in the state space, our model can be generalized to complicated environments like those containing turns and narrow passages without loss of accuracy, and our model can be easily integrated with other sampling-based path planning algorithms.
Safe motion planning with environment uncertainty
Thomas, Antony, Mastrogiovanni, Fulvio, Baglietto, Marco
We present an approach for safe motion planning under robot state and environment (obstacle and landmark location) uncertainties. To this end, we first develop an approach that accounts for the landmark uncertainties during robot localization. Existing planning approaches assume that the landmark locations are well known or are known with little uncertainty. However, this might not be true in practice. Noisy sensors and imperfect motions compound to the errors originating from the estimate of environment features. Moreover, possible occlusions and dynamic objects in the environment render imperfect landmark estimation. Consequently, not considering this uncertainty can wrongly localize the robot, leading to inefficient plans. Our approach thus incorporates the landmark uncertainty within the Bayes filter estimation framework. We also analyze the effect of considering this uncertainty and delineate the conditions under which it can be ignored. Second, we extend the state-of-the-art by computing an exact expression for the collision probability under Gaussian distributed robot motion, perception and obstacle location uncertainties. We formulate the collision probability process as a quadratic form in random variables. Under Gaussian distribution assumptions, an exact expression for collision probability is thus obtained which is computable in real-time. In contrast, existing approaches approximate the collision probability using upper-bounds that can lead to overly conservative estimate and thereby suboptimal plans. We demonstrate and evaluate our approach using a theoretical example and simulations. We also present a comparison of our approach to different state-of-the-art methods.
Waterberry Farms: A Novel Benchmark For Informative Path Planning
Matloob, Samuel, Datta, Partha P., Kreidl, O. Patrick, Dutta, Ayan, Roy, Swapnoneel, Bölöni, Ladislau
Recent developments in robotic and sensor hardware make data collection with mobile robots (ground or aerial) feasible and affordable to a wide population of users. The newly emergent applications, such as precision agriculture, weather damage assessment, or personal home security often do not satisfy the simplifying assumptions made by previous research: the explored areas have complex shapes and obstacles, multiple phenomena need to be sensed and estimated simultaneously and the measured quantities might change during observations. The future progress of path planning and estimation algorithms requires a new generation of benchmarks that provide representative environments and scoring methods that capture the demands of these applications. This paper describes the Waterberry Farms benchmark (WBF) that models a precision agriculture application at a Florida farm growing multiple crop types. The benchmark captures the dynamic nature of the spread of plant diseases and variations of soil humidity while the scoring system measures the performance of a given combination of a movement policy and an information model estimator. By benchmarking several examples of representative path planning and estimator algorithms, we demonstrate WBF's ability to provide insight into their properties and quantify future progress.
Multimodal Contextualized Plan Prediction for Embodied Task Completion
İnan, Mert, Padmakumar, Aishwarya, Gella, Spandana, Lange, Patrick, Hakkani-Tur, Dilek
Task planning is an important component of traditional robotics systems enabling robots to compose fine grained skills to perform more complex tasks. Recent work building systems for translating natural language to executable actions for task completion in simulated embodied agents is focused on directly predicting low level action sequences that would be expected to be directly executable by a physical robot. In this work, we instead focus on predicting a higher level plan representation for one such embodied task completion dataset - TEACh, under the assumption that techniques for high-level plan prediction from natural language are expected to be more transferable to physical robot systems. We demonstrate that better plans can be predicted using multimodal context, and that plan prediction and plan execution modules are likely dependent on each other and hence it may not be ideal to fully decouple them. Further, we benchmark execution of oracle plans to quantify the scope for improvement in plan prediction models.
IndoorSim-to-OutdoorReal: Learning to Navigate Outdoors without any Outdoor Experience
Truong, Joanne, Zitkovich, April, Chernova, Sonia, Batra, Dhruv, Zhang, Tingnan, Tan, Jie, Yu, Wenhao
We present IndoorSim-to-OutdoorReal (I2O), an end-to-end learned visual navigation approach, trained solely in simulated short-range indoor environments, and demonstrates zero-shot sim-to-real transfer to the outdoors for long-range navigation on the Spot robot. Our method uses zero real-world experience (indoor or outdoor), and requires the simulator to model no predominantly-outdoor phenomenon (sloped grounds, sidewalks, etc). The key to I2O transfer is in providing the robot with additional context of the environment (i.e., a satellite map, a rough sketch of a map by a human, etc.) to guide the robot's navigation in the real-world. The provided context-maps do not need to be accurate or complete -- real-world obstacles (e.g., trees, bushes, pedestrians, etc.) are not drawn on the map, and openings are not aligned with where they are in the real-world. Crucially, these inaccurate context-maps provide a hint to the robot about a route to take to the goal. We find that our method that leverages Context-Maps is able to successfully navigate hundreds of meters in novel environments, avoiding novel obstacles on its path, to a distant goal without a single collision or human intervention. In comparison, policies without the additional context fail completely. Lastly, we test the robustness of the Context-Map policy by adding varying degrees of noise to the map in simulation. We find that the Context-Map policy is surprisingly robust to noise in the provided context-map. In the presence of significantly inaccurate maps (corrupted with 50% noise, or entirely blank maps), the policy gracefully regresses to the behavior of a policy with no context. Videos are available at https://www.joannetruong.com/projects/i2o.html
SwipeBot: DNN-based Autonomous Robot Navigation among Movable Obstacles in Cluttered Environments
Zherdev, Nikolay, Kurenkov, Mikhail, Belikova, Kristina, Tsetserukou, Dzmitry
In this paper, we propose a novel approach to wheeled robot navigation through an environment with movable obstacles. A robot exploits knowledge about different obstacle classes and selects the minimally invasive action to perform to clear the path. We trained a convolutional neural network (CNN), so the robot can classify an RGB-D image and decide whether to push a blocking object and which force to apply. After known objects are segmented, they are being projected to a cost-map, and a robot calculates an optimal path to the goal. If the blocking objects are allowed to be moved, a robot drives through them while pushing them away. We implemented our algorithm in ROS, and an extensive set of simulations showed that the robot successfully overcomes the blocked regions. Our approach allows a robot to successfully build a path through regions, where it would have stuck with traditional path-planning techniques.
Reducing Onboard Processing Time for Path Planning in Dynamically Evolving Polygonal Maps
Shirwatkar, Aditya, Singh, Aman, Kiran, Jana Ravi
Autonomous agents face the challenge of coordinating multiple tasks (perception, motion planning, controller) which are computationally expensive on a single onboard computer. To utilize the onboard processing capacity optimally, it is imperative to arrive at computationally efficient algorithms for global path planning. In this work, it is attempted to reduce the processing time for global path planning in dynamically evolving polygonal maps. In dynamic environments, maps may not remain valid for long. Hence it is of utmost importance to obtain the shortest path quickly in an ever-changing environment. To address this, an existing rapid path-finding algorithm, the Minimal Construct was used. This algorithm discovers only a necessary portion of the Visibility Graph around obstacles and computes collision tests only for lines that seem heuristically promising. Simulations show that this algorithm finds shortest paths faster than traditional grid-based A* searches in most cases, resulting in smoother and shorter paths even in dynamic environments.
Anticipatory Planning: Improving Long-Lived Planning by Estimating Expected Cost of Future Tasks
Dhakal, Roshan, Talukder, Md Ridwan Hossain, Stein, Gregory J.
We consider a service robot in a household environment given a sequence of high-level tasks one at a time. Most existing task planners, lacking knowledge of what they may be asked to do next, solve each task in isolation and so may unwittingly introduce side effects that make subsequent tasks more costly. In order to reduce the overall cost of completing all tasks, we consider that the robot must anticipate the impact its actions could have on future tasks. Thus, we propose anticipatory planning: an approach in which estimates of the expected future cost, from a graph neural network, augment model-based task planning. Our approach guides the robot towards behaviors that encourage preparation and organization, reducing overall costs in long-lived planning scenarios. We evaluate our method on blockworld environments and show that our approach reduces the overall planning costs by 5% as compared to planning without anticipatory planning. Additionally, if given an opportunity to prepare the environment in advance (a special case of anticipatory planning), our planner improves overall cost by 11%.
Online Learning in Budget-Constrained Dynamic Colonel Blotto Games
Leon, Vincent, Etesami, S. Rasoul
In this paper, we study the strategic allocation of limited resources using a Colonel Blotto game (CBG) under a dynamic setting and analyze the problem using an online learning approach. In this model, one of the players is a learner who has limited troops to allocate over a finite time horizon, and the other player is an adversary. In each round, the learner plays a one-shot Colonel Blotto game with the adversary and strategically determines the allocation of troops among battlefields based on past observations. The adversary chooses its allocation action randomly from some fixed distribution that is unknown to the learner. The learner's objective is to minimize its regret, which is the difference between the cumulative reward of the best mixed strategy and the realized cumulative reward by following a learning algorithm while not violating the budget constraint. The learning in dynamic CBG is analyzed under the framework of combinatorial bandits and bandits with knapsacks. We first convert the budget-constrained dynamic CBG to a path planning problem on a directed graph. We then devise an efficient algorithm that combines a special combinatorial bandit algorithm for path planning problem and a bandits with knapsack algorithm to cope with the budget constraint. The theoretical analysis shows that the learner's regret is bounded by a term sublinear in time horizon and polynomial in other parameters. Finally, we justify our theoretical results by carrying out simulations for various scenarios.
Repeated Principal-Agent Games with Unobserved Agent Rewards and Perfect-Knowledge Agents
Dogan, Ilgin, Shen, Zuo-Jun Max, Aswani, Anil
System designers frequently use the idea of providing incentives to stakeholders as a powerful means of steering the stakeholders for their own benefit. Operations management includes many such examples, such as offering performance-based bonuses to ride-hailing drivers, providing monetary incentives to patients for medical adherence, quality-contingent bonus payments for workers in crowdsourcing platforms, and vertical collaboration between shippers and carriers in transportation planning. In many real-world settings, the problem of designing efficient incentives can be posed as a repeated principal-agent problem where a principal (i.e., system designer) designs sequential incentive policies to motivate an agent (i.e., stakeholder) to convey certain behaviors that eventually serve the goal of maximizing the principal's cumulative net reward. Typically, there is an element of information asymmetry in these systems which arises between the principal and the agent in the form of either adverse selection (i.e., hidden information) or moral hazard (i.e., hidden actions) (Bolton and Dewatripont 2004). For instance, in the context of employment incentives designed by an employer, the hidden information in an adverse selection setting could be the level of productivity of an employee whereas a hidden action in the moral hazard setting could be the total effort level of the employee. More generally, the hidden information in the adverse selection setting can be seen as an unknown "type" or "preferences" of the agent that directly affects the action chosen by the agent, which in turn determines both the agent's utility and the principal's reward. These situations require specification of the agent's private information and the distributional-knowledge that the principal has concerning that information. Existing literature on repeated principal-agent models mostly studies the moral hazard setting, with a more recent focus on the problem of estimating agent's unknown model parameters under hidden actions (e.g., Ho et al. 2016, Kaynar and Siddiq 2022). On the other hand, the adverse selection setting is mostly studied either for single-period static games (Navabi and Nayyar 2018, Chade and Swinkels 2019, Gottlieb and Moreira 2022) or else for the repeated dynamic games where restrictive assumptions are made on, for example, dimension of the agent's action space, utility Dogan et.