Goto

Collaborating Authors

 Planning & Scheduling


Better Be Lucky than Good: Exceeding Expectations in MDP Evaluation

AAAI Conferences

Two other algorithms require the knowledge Markov Decision Processes (MDPs) offer a general framework of the optimal policy and its expected reward. We show to describe probabilistic planning problems of varying that the expected reward of the optimal policy is a lower complexity. The development of algorithms that act successfully bound for the expected performance of both strategies. in MDPs is important to many AI applications. Our final algorithm switches between the application of Since it is often impossible or intractable to evaluate MDP the optimal policy and the policy with the highest possible algorithms based on a theoretical analysis alone, the International outcome, which can be computed without notable overhead Probabilistic Planning Competition (IPPC) was introduced in the Trial-based Heuristic Tree Search (THTS) framework to allow a comparison based on experimental evaluation. (Keller and Helmert 2013). We show theoretically and empirically The idea is to approximate the quality of an MDP that all algorithms outperform the naïve base approach solver by performing a sequence of runs on a problem instance, that ignores the potential of optimizing evaluation and by using the average of the obtained results as runs in hindsight, and that it pays off to take suboptimal base an approximation of the expected reward.


tBurton: A Divide and Conquer Temporal Planner

AAAI Conferences

Planning for and controlling a network of interacting devices requires a planner that accounts for the automatic timed transitions of devices, while meeting deadlines and achieving durative goals. Consider a planner for an imaging satellite with a camera that cannot tolerate exhaust. The planner would need to determine that opening a valve causes a chain reaction that ignites the engine, and thus needs to shield the camera. While planners exist that support deadlines and durative goals, currently, no planners can handle automatic timed transitions. We present tBurton, a temporal planner that supports these features, while additionally producing a temporally least-commitment plan. tBurton uses a divide and conquer approach: dividing the problem using causal-graph decomposition and conquering each factor with heuristic forward search. The `sub-plans' from each factor are then unified in a conflict directed search, guided by the causal graph structure. We describe why this approach is fast and efficient, and demonstrate its ability to improve the performance of existing planners on factorable problems through benchmarks from the International Planning Competition.


Tractability of Planning with Loops

AAAI Conferences

We create a unified framework for analyzing and synthesizing plans with loops for solving problems with non-deterministic numeric effects and a limited form of partial observability. Three different action models---with deterministic, qualitative non-deterministic and Boolean non-deterministic semantics---are handled using a single abstract representation. We establish the conditions under which the correctness and termination of solutions, represented as abstract policies, can be verified. We also examine the feasibility of learning abstract policies from examples. We demonstrate our techniques on several planning problems and show that they apply to challenging real-world tasks such as doing the laundry with a PR2 robot. These results resolve a number of open questions about planning with loops and facilitate the development of new algorithms and applications.


Improving Exploration in UCT Using Local Manifolds

AAAI Conferences

Monte-Carlo planning has been proven successful in manysequential decision-making settings, but it suffers from poorexploration when the rewards are sparse. In this paper, weimprove exploration in UCT by generalizing across similarstates using a given distance metric. We show that this algorithm,like UCT, converges asymptotically to the optimalaction. When the state space does not have a natural distancemetric, we show how we can learn a local manifold from thetransition graph of states in the near future. to obtain a distancemetric. On domains inspired by video games, empiricalevidence shows that our algorithm is more sample efficientthan UCT, particularly when rewards are sparse.


Factored Symmetries for Merge-and-Shrink Abstractions

AAAI Conferences

Merge-and-shrink heuristics crucially rely on effective reduction techniques, such as bisimulation-based shrinking, to avoid the combinatorial explosion of abstractions. We propose the concept of factored symmetries for merge-and-shrink abstractions based on the established concept of symmetry reduction for state-space search. We investigate under which conditions factored symmetry reduction yields perfect heuristics and discuss the relationship to bisimulation. We also devise practical merging strategies based on this concept and experimentally validate their utility.


Heuristics and Symmetries in Classical Planning

AAAI Conferences

Heuristic search is a state-of-the-art approach to classical planning. Several heuristic families were developed over the years to automatically estimate goal distance information from problem descriptions. Orthogonally to the development of better heuristics, recent years have seen an increasing interest in symmetry-based state space pruning techniques that aim at reducing the search effort. However, little work has dealt with how the heuristics behave under symmetries. We investigate the symmetry properties of existing heuristics and reveal that many of them are invariant under symmetries.


Automatic Configuration of Sequential Planning Portfolios

AAAI Conferences

Sequential planning portfolios exploit the complementary strengths of different planners. Similarly, automated algorithm configuration tools can customize parameterized planning algorithms for a given type of tasks. Although some work has been done towards combining portfolios and algorithm configuration, the problem of automatically generating a sequential planning portfolio from a parameterized planner for a given type of tasks is still largely unsolved. Here, we present Cedalion, a conceptually simple approach for this problem that greedily searches for the pair of parameter configuration and runtime which, when appended to the current portfolio, maximizes portfolio improvement per additional runtime spent. We show theoretically that Cedalion yields portfolios provably within a constant factor of optimal for the training set distribution. We evaluate Cedalion empirically by applying it to construct sequential planning portfolios based on component planners from the highly parameterized Fast Downward (FD) framework. Results for a broad range of planning settings demonstrate that -- without any knowledge of planning or FD -- Cedalion constructs sequential FD portfolios that rival, and in some cases substantially outperform, manually-built FD portfolios.


From Non-Negative to General Operator Cost Partitioning

AAAI Conferences

Operator cost partitioning is a well-known technique to make admissible heuristics additive by distributing the operator costs among individual heuristics. Planning tasks are usually defined with non-negative operator costs and therefore it appears natural to demand the same for the distributed costs. We argue that this requirement is not necessary and demonstrate the benefit of using general cost partitioning. We show that LP heuristics for operator-counting constraints are cost-partitioned heuristics and that the state equation heuristic computes a cost partitioning over atomic projections. We also introduce a new family of potential heuristics and show their relationship to general cost partitioning.


Planning Over Multi-Agent Epistemic States: A Classical Planning Approach

AAAI Conferences

Many AI applications involve the interaction of multiple autonomous agents, requiring those agents to reason about their own beliefs, as well as those of other agents. However, planning involving nested beliefs is known to be computationally challenging. In this work, we address the task of synthesizing plans that necessitate reasoning about the beliefs of other agents. We plan from the perspective of a single agent with the potential for goals and actions that involve nested beliefs, non-homogeneous agents, co-present observations, and the ability for one agent to reason as if it were another. We formally characterize our notion of planning with nested belief, and subsequently demonstrate how to automatically convert such problems into problems that appeal to classical planning technology. Our approach represents an important first step towards applying the well-established field of automated planning to the challenging task of planning involving nested beliefs of multiple agents.


Goal Recognition Design for Non-Optimal Agents

AAAI Conferences

Goal recognition design involves the offline analysis of goal recognition models by formulating measures that assess the ability to perform goal recognition within a model and finding efficient ways to compute and optimize them. In this work we present goal recognition design for non-optimal agents, which extends previous work by accounting for agents that behave non-optimally either intentionally or naıvely. The analysis we present includes a new generalized model for goal recognition design and the worst case distinctiveness (wcd) measure. For two special cases of sub-optimal agents we present methods for calculating the wcd, part of which are based on novel compilations to classical planning problems. Our empirical evaluation shows the proposed solutions to be effective in computing and optimizing the wcd.