In many robot motion planning problems such as manipulation planning for a personal robot in a kitchen or an industrial manipulator in a warehouse, all motion planning queries are in an environment that is largely static. Consequently, one should be able to improve the performance of a planning algorithm by training on this static environment ahead of operation time. In this work, we propose a method to improve the performance of heuristic search-based motion planners in such environments. The first, learning, phase of our proposed method analyzes search performance on multiple planning episodes to infer local minima zones, that is, regions where the existing heuristic(s) are weakly correlated with the true cost-to-go. Then, in the planning phase of the method, the learnt local minima are used to modify the original search graph in a way that improves search performance. We prove that our method preserves guarantees on completeness and bounded suboptimality with respect to the original search graph. Experimentally, we observe significant improvements in success rate and planning time for challenging 11 degree-of-freedom mobile manipulation problems.
If you've seen a robot manipulation demo, you've almost certainly noticed that the robot tends to spend a lot of time looking like it's not doing anything. It's tempting to say that the robot is "thinking" when this happens, and that might even be mostly correct: Odds are that you're waiting for some motion-planning algorithm to figure out how to get the robot's arm and gripper to do what it's supposed to do without running into anything. This motion-planning process is one of the most important skills a robot can have, and it's also one of the most time consuming. Researchers at Duke University, in Durham, N.C., have found a way to speed up motion planning by three orders of magnitude while using one-twentieth the power. Their solution is a custom processor that can perform the most time-consuming part of the job--checking for all potential collisions across the robot's entire range of motion--with unprecedented efficiency.
We address the problem of finding shortest paths in graphs where some edges have a prior probability of existence, and their existence can be verified during planning with time- consuming operations. Our work is motivated by real-world robot motion planning, where edge existence is often expensive to verify (typically involves time-consuming collision-checking between the robot and world models), but edge existence probabilities are readily available. The goal then, is to develop an anytime algorithm that can return good solutions quickly by somehow leveraging the existence probabilities, and continue to return better-quality solutions or provide tighter suboptimality bounds with more time. While our motivation is fast and high-quality motion planning for robots, this work presents two fundamental contributions applicable to generic graphs with probabilistic edges. They are: a) an algorithm for efficiently computing all relevant shortest paths in a graph with probabilistic edges, and as a by-product the expected shortest path cost, and b) an anytime algorithm for evaluating (verifying existence of) edges in a collection of paths, which is optimal in expectation under a chosen distribution of the algorithm interruption time. Finally, we provide a practical approach to integrate a) and b) in the context of robot motion planning and demonstrate significant improvements in success rate and planning time for a 11 degree-of-freedom mobile manipulation planning problem. We also conduct additional evaluations on a 2D grid navigation domain to study our algorithm's behavior.
We present a novel learning-based method for generating optimal motion plans for high-dimensional motion planning problems. In order to cope with the curse of dimensional- ity, our method proceeds in a fashion similar to block co- ordinate descent in finite-dimensional optimization: at each iteration, the motion is optimized over a lower dimensional subspace while leaving the path fixed along the other dimen- sions. Naive implementations of such an idea can produce vastly suboptimal results. In this work, we show how a prof- itable set of directions in which to perform this dimensional descent procedure can be learned efficiently. We provide suf- ficient conditions for global optimality of dimensional de- scent in this learned basis, based upon the low-dimensional structure of the planning cost function. We also show how this dimensional descent procedure can easily be used for problems that do not exhibit such structure with monotonic convergence. We illustrate the application of our method to high dimensional shape planning and arm trajectory planning problems.
Task-motion planning (TMP) addresses the problem of efficiently generating executable and low-cost task plans in a discrete space such that the (initially unknown) action costs are determined by motion plans in a corresponding continuous space. However, a task-motion plan can be sensitive to unexpected domain uncertainty and changes, leading to suboptimal behaviors or execution failures. In this paper, we propose a novel framework, TMP-RL, which is an integration of TMP and reinforcement learning (RL) from the execution experience, to solve the problem of robust task-motion planning in dynamic and uncertain domains. TMP-RL features two nested planning-learning loops. In the inner TMP loop, the robot generates a low-cost, feasible task-motion plan by iteratively planning in the discrete space and updating relevant action costs evaluated by the motion planner in continuous space. In the outer loop, the plan is executed, and the robot learns from the execution experience via model-free RL, to further improve its task-motion plans. RL in the outer loop is more accurate to the current domain but also more expensive, and using less costly task and motion planning leads to a jump-start for learning in the real world. Our approach is evaluated on a mobile service robot conducting navigation tasks in an office area. Results show that TMP-RL approach significantly improves adaptability and robustness (in comparison to TMP methods) and leads to rapid convergence (in comparison to task planning (TP)-RL methods). We also show that TMP-RL can reuse learned values to smoothly adapt to new scenarios during long-term deployments.