In this paper we present a planner, LPRPG-P, capable of reasoning with the non-temporal subset of PDDL 3 preferences. Our focus is on computation of relaxed plan based heuristics that effectively guide a planner towards good solutions satisfying preferences. We build on the planner LPRPG, a hybrid relaxed planning graph (RPG)--linear programming (LP) approach. We make extensions to the RPG to reason with propositional preferences, and to the LP to reason with numeric preferences. LPRPG-P is the first planner with direct guidance for numeric preference satisfaction, exploiting the strong numeric reasoning of the LP. We introduce an anytime search approach for use with our new heuristic, and present results showing that LPRPG-P extends the state of the art in domain-independent planning with preferences.
Planning, scheduling, and other applications of heuristic search often demand we tackle problems that are too large to solve optimally. In this paper, we address the problem of solving shortest-path problems as quickly as possible while guaranteeing that solution costs are bounded within a specified factor of optimal.
We consider the problem of online planning in a Markov decision process with discounted rewards for any given initial state. We consider the PAC sample complexity problem of computing, with probability $1-\delta$, an $\epsilon$-optimal action using the smallest possible number of calls to the generative model (which provides reward and next-state samples). We design an algorithm, called StOP (for Stochastic-Optimistic Planning), based on the optimism in the face of uncertainty" principle. StOP can be used in the general setting, requires only a generative model, and enjoys a complexity bound that only depends on the local structure of the MDP."
We consider the stochastic variant of the Canadian Traveler's Problem, a path planning problem where adverse weather can cause some roads to be untraversable. The agent does not initially know which roads can be used. However, it knows a probability distribution for the weather, and it can observe the status of roads incident to its location. The objective is to find a policy with low expected travel cost. We introduce and compare several algorithms for the stochastic CTP. Unlike the optimistic approach most commonly considered in the literature, the new approaches we propose take uncertainty into account explicitly. We show that this property enables them to generate policies of much higher quality than the optimistic one, both theoretically and experimentally.
Wilson et al. propose a measure of flexibility in project scheduling problems and propose several ways of distributing flexibility over tasks without overrunning the deadline. These schedules prove quite robust: delays of some tasks do not necessarily lead to delays of subsequent tasks. The number of tasks that finish late depends, among others, on the way of distributing flexibility. In this paper I study the different flexibility distributions proposed by Wilson et al. and the differences in number of violations (tasks that finish too late). I show one factor in the instances that causes differences in the number of violations, as well as two properties of the flexibility distribution that cause them to behave differently. Based on these findings, I propose three new flexibility distributions. Depending on the nature of the delays, these new flexibility distributions perform as good as or better than the distributions by Wilson et al.