Collaborating Authors

 Xie, Fan


Deep Imitation Learning for Bimanual Robotic Manipulation

arXiv.org Artificial Intelligence

We present a deep imitation learning framework for robotic bimanual manipulation in a continuous state-action space. Imitation learning has been effectively utilized in mimicking bimanual manipulation movements, but generalizing the movement to objects in different locations has not been explored. We hypothesize that precisely generalizing the learned behavior relative to an object's location requires modeling relational information in the environment. To achieve this, we designed a method that (i) uses a multi-model framework to decompose complex dynamics into elemental movement primitives, and (ii) parameterizes each primitive using a recurrent graph neural network to capture interactions. Our model is a deep, hierarchical, modular architecture with a high-level planner that learns to compose primitives sequentially and a low-level controller that integrates primitive dynamics modules and inverse kinematics control. We demonstrate its effectiveness on several simulated bimanual robotic manipulation tasks. Compared to models based on previous imitation learning studies, our model generalizes better and achieves higher success rates in the simulated tasks.


On the Completeness of Best-First Search Variants That Use Random Exploration

AAAI Conferences

While suboptimal best-first search algorithms like Greedy Best-First Search are frequently used when building automated planning systems, their greedy nature can make them susceptible to being easily misled by flawed heuristics. This weakness has motivated the development of best-first search variants like epsilon-greedy node selection, type-based exploration, and diverse best-first search, which all use random exploration to mitigate the impact of heuristic error. In this paper, we provide a theoretical justification for this increased robustness by formally analyzing how these algorithms behave on infinite graphs. In particular, we show that when using these approaches on any infinite graph, the probability of not finding a solution can be made arbitrarily small given enough time. This result is shown to hold for a class of algorithms that includes the three mentioned above, regardless of how misleading the heuristic is.
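As an illustration of the random exploration discussed in this abstract, here is a minimal Python sketch of epsilon-greedy node selection inside a greedy best-first search. It is not the authors' implementation: the function name `epsilon_greedy_gbfs`, its parameters, the flat open list, and the expansion cap are all assumptions made for the example. With probability `epsilon` the next node is drawn uniformly at random from the open list; otherwise the node with the lowest heuristic value is taken.

```python
import random


def epsilon_greedy_gbfs(start, is_goal, successors, h,
                        epsilon=0.2, rng=None, max_expansions=100_000):
    """Hypothetical sketch of GBFS with epsilon-greedy node selection.

    Returns a path from start to a goal, or None if the expansion
    budget runs out or the open list empties.
    """
    rng = rng or random.Random(0)
    open_list = [start]
    parent = {start: None}  # also serves as the set of generated nodes

    for _ in range(max_expansions):
        if not open_list:
            return None
        if rng.random() < epsilon:
            # Exploration step: ignore the heuristic entirely.
            idx = rng.randrange(len(open_list))
        else:
            # Greedy step: lowest heuristic value (linear scan for brevity).
            idx = min(range(len(open_list)), key=lambda i: h(open_list[i]))
        # Remove the chosen node in O(1) by swapping it to the end.
        open_list[idx], open_list[-1] = open_list[-1], open_list[idx]
        node = open_list.pop()

        if is_goal(node):
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]

        for child in successors(node):
            if child not in parent:
                parent[child] = node
                open_list.append(child)
    return None
```

On an infinite integer line with successors `n + 1` and `n - 1`, goal `5`, and the deliberately misleading heuristic `h(n) = n`, the purely greedy setting (`epsilon=0.0`) walks toward negative infinity and exhausts any finite budget, while `epsilon=0.2` occasionally expands the other frontier node and reaches the goal.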


Adding Local Exploration to Greedy Best-First Search in Satisficing Planning

AAAI Conferences

Greedy Best-First Search (GBFS) is a powerful algorithm at the heart of many state-of-the-art satisficing planners. One major weakness of GBFS is its behavior in so-called uninformative heuristic regions (UHRs): parts of the search space in which no heuristic provides guidance towards states with improved heuristic values. This work analyzes the problem of UHRs in planning in detail, and proposes a two-level search framework as a solution. In Greedy Best-First Search with Local Exploration (GBFS-LE), a local exploration is started from within a global GBFS whenever the search seems stuck in UHRs. Two different local exploration strategies are developed and evaluated experimentally: Local GBFS (LS) and Local Random Walk Search (LRW). The two new planners LAMA-LS and LAMA-LRW integrate these strategies into the GBFS component of LAMA-2011. Both are shown to yield clear improvements in terms of both coverage and search time on standard International Planning Competition benchmarks, especially for domains that are proven to have large or unbounded UHRs.
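The idea of a random-walk local exploration can be sketched as follows. This is only an illustration under stated assumptions, not the LRW procedure from the paper: the function name `local_random_walk`, the fixed number of walks, and the fixed walk length are hypothetical choices. Starting from a state where the global search appears stuck, the sketch runs bounded random walks and returns the first state whose heuristic value is below the best value seen so far.

```python
import random


def local_random_walk(state, successors, h, h_best,
                      walks=100, walk_len=10, rng=None):
    """Hypothetical sketch: escape a plateau with bounded random walks.

    Returns a state with h(state) < h_best, or None if no walk finds one.
    """
    rng = rng or random.Random(0)
    for _ in range(walks):
        current = state  # every walk restarts from the stuck state
        for _ in range(walk_len):
            children = list(successors(current))
            if not children:
                break  # dead end: abandon this walk
            current = rng.choice(children)
            if h(current) < h_best:
                return current
    return None
```

In a two-level scheme, a caller would hand the returned state back to the global search; when the function returns `None`, the global search simply continues as before.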


Type-Based Exploration with Multiple Search Queues for Satisficing Planning

AAAI Conferences

Utilizing multiple queues in Greedy Best-First Search (GBFS) has been proven to be a very effective approach to satisficing planning. Successful techniques include extra queues based on Helpful Actions (or Preferred Operators), as well as using Multiple Heuristics. One weakness of all standard GBFS algorithms is their lack of exploration. All queues used in these methods work as priority queues sorted by heuristic values. Therefore, misleading heuristics, especially early in the search process, can cause the search to become ineffective. Type systems, as introduced for heuristic search by Lelis et al., are a development of ideas for exploration related to the classic stratified sampling approach. The current work introduces a search algorithm that utilizes type systems in a new way – for exploration within a GBFS multi-queue framework in satisficing planning. A careful case study shows the benefits of such exploration for overcoming deficiencies of the heuristic. The proposed new baseline algorithm Type-GBFS solves almost 200 more problems than baseline GBFS over all International Planning Competition problems. Type-LAMA, a new planner which integrates Type-GBFS into LAMA-2011, solves 36.8 more problems than LAMA-2011.
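A type-based exploration queue can be sketched as a set of buckets keyed by node type. The sketch below is an assumption-laden illustration rather than the Type-GBFS implementation: the class name `TypeBuckets`, the caller-supplied type key (for example a pair of heuristic value and path cost), and the rule of choosing a type uniformly at random and then a node uniformly within it are all choices made for this example. The point it shows is that a rarely populated type is as likely to be sampled as a crowded one, which is what distinguishes this queue from one sorted by heuristic value.

```python
import random
from collections import defaultdict


class TypeBuckets:
    """Hypothetical exploration queue: nodes grouped by a type key."""

    def __init__(self, rng=None):
        self.rng = rng or random.Random(0)
        self.buckets = defaultdict(list)

    def push(self, node, node_type):
        self.buckets[node_type].append(node)

    def pop(self):
        """Pick a random non-empty type, then a random node of that type.

        Raises IndexError when the queue is empty.
        """
        node_type = self.rng.choice(list(self.buckets))
        bucket = self.buckets[node_type]
        i = self.rng.randrange(len(bucket))
        bucket[i], bucket[-1] = bucket[-1], bucket[i]
        node = bucket.pop()
        if not bucket:
            del self.buckets[node_type]  # keep only non-empty types
        return node

    def __len__(self):
        return sum(len(b) for b in self.buckets.values())
```

In a multi-queue search, one plausible arrangement (again an assumption here) is to alternate between popping from a heuristic-ordered priority queue and popping from a `TypeBuckets` instance, pushing every generated node into both.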


A Comparison of Knowledge-Based GBFS Enhancements and Knowledge-Free Exploration

AAAI Conferences

GBFS-based satisficing planners often augment their search with knowledge-based enhancements such as preferred operators and multiple heuristics. These techniques seek to improve planner performance by making the search more informed. In this work, we focus on how these enhancements impact coverage, and we use a simple technique called epsilon-greedy node selection to demonstrate that planner coverage can also be improved by introducing knowledge-free random exploration into the search. We then revisit the existing knowledge-based enhancements to determine whether the knowledge they employ offers necessary guidance, or whether its main effect is to add exploration that could be achieved more simply using randomness. This investigation provides further evidence of the importance of preferred operators and shows that the knowledge added when using an additional heuristic is crucial in certain domains, while not being as effective as random exploration in others. Finally, we demonstrate that random exploration can also improve the coverage of LAMA, a planner which already employs multiple enhancements. This suggests that knowledge-based enhancements need to be compared to appropriate knowledge-free random baselines to establish the importance of the knowledge being used.


Better Time Constrained Search via Randomization and Postprocessing

AAAI Conferences

Most satisficing planners based on heuristic search iteratively improve their solution quality through an anytime approach. Typically, the lowest-cost solution found so far is used to constrain the search. This avoids areas of the state space which cannot directly lead to lower-cost solutions. However, in this paper we show that when used in conjunction with a post-processing plan improvement system such as ARAS, this bounding approach can harm a planner’s performance since the bound may prevent the search from ever finding additional plans for the post-processor to improve. The new anytime search framework of Diverse Any-Time Search addresses this issue through the use of restarts, randomization, and by not bounding as strictly as is done by previous approaches. We show that by using these techniques, the framework is able to generate a more diverse set of “raw” input plans for the post-processor to work on. We then show that when adding both Diverse Any-Time Search and the ARAS post-processor to LAMA-2011, the winner of the most recent IPC planning competition, the performance according to the IPC scoring metric improves from 511 points to over 570 points when tested on the 550 problems from IPC 2008 and IPC 2011. Performance gains are also seen when these techniques are added to the Anytime Explicit Estimation Algorithm (AEES), as the performance improves from 440 points to over 513 points on the same problem set.


A Local Monte Carlo Tree Search Approach in Deterministic Planning

AAAI Conferences

Much recent work in satisficing planning has aimed at striking a balance between coverage - solving as many problems as possible - and plan quality. Current planners achieve near-perfect coverage on the latest IPC benchmarks. It is therefore natural to investigate their scaling behavior on more difficult instances. Among state-of-the-art planners, LAMA (Richter, Helmert, and Westphal 2008) is able to generate high-quality plans, but its coverage drops off rapidly with increasing problem complexity. The Arvand planner (Nakhost and Müller 2009) scales to much harder instances but generates lower-quality plans. This paper introduces a new algorithm, Monte Carlo Random Walk-based Local Tree Search (MRW-LTS), which uses random walks to selectively build local search trees. Experiments demonstrate that MRW-LTS combines a scaling behavior that is better than LAMA’s with a plan quality that is better than Arvand’s.