Planning, scheduling, and other applications of heuristic search often demand we tackle problems that are too large to solve optimally. In this paper, we address the problem of solving shortest-path problems as quickly as possible while guaranteeing that solution costs are bounded within a specified factor of optimal.
We consider the problem of online planning in a Markov decision process with discounted rewards for any given initial state. We consider the PAC sample complexity problem of computing, with probability $1-\delta$, an $\epsilon$-optimal action using the smallest possible number of calls to the generative model (which provides reward and next-state samples). We design an algorithm, called StOP (for Stochastic-Optimistic Planning), based on the optimism in the face of uncertainty" principle. StOP can be used in the general setting, requires only a generative model, and enjoys a complexity bound that only depends on the local structure of the MDP."
In this paper we present a planner, LPRPG-P, capable of reasoning with the non-temporal subset of PDDL 3 preferences. Our focus is on computation of relaxed plan based heuristics that effectively guide a planner towards good solutions satisfying preferences. We build on the planner LPRPG, a hybrid relaxed planning graph (RPG)--linear programming (LP) approach. We make extensions to the RPG to reason with propositional preferences, and to the LP to reason with numeric preferences. LPRPG-P is the first planner with direct guidance for numeric preference satisfaction, exploiting the strong numeric reasoning of the LP. We introduce an anytime search approach for use with our new heuristic, and present results showing that LPRPG-P extends the state of the art in domain-independent planning with preferences.
We consider the stochastic variant of the Canadian Traveler's Problem, a path planning problem where adverse weather can cause some roads to be untraversable. The agent does not initially know which roads can be used. However, it knows a probability distribution for the weather, and it can observe the status of roads incident to its location. The objective is to find a policy with low expected travel cost. We introduce and compare several algorithms for the stochastic CTP. Unlike the optimistic approach most commonly considered in the literature, the new approaches we propose take uncertainty into account explicitly. We show that this property enables them to generate policies of much higher quality than the optimistic one, both theoretically and experimentally.
Planning is a notoriously difficult computational problem of high worst-case complexity. Researchers have been investing significant efforts to develop heuristics or restrictions to make planning practically feasible. Case-based planning is a heuristic approach where one tries to reuse previous experience when solving similar problems in order to avoid some of the planning effort. Plan reuse may offer an interesting alternative to plan generation in some settings. We provide theoretical results that identify situations in which plan reuse is provably tractable. We perform our analysis in the framework of parameterized complexity, which supports a rigorous worst-case complexity analysis that takes structural properties of the input into account in terms of parameters. A central notion of parameterized complexity is fixed-parameter tractability which extends the classical notion of polynomial-time tractability by utilizing the effect of parameters. We draw a detailed map of the parameterized complexity landscape of several variants of problems that arise in the context of case-based planning. In particular, we consider the problem of reusing an existing plan, imposing various restrictions in terms of parameters, such as the number of steps that can be added to the existing plan to turn it into a solution of the planning instance at hand.