Plotting

 Thiebaux, Sylvie


Implementation and Comparison of Solution Methods for Decision Processes with Non-Markovian Rewards

arXiv.org Artificial Intelligence

This paper examines a number of solution methods for decision processes with non-Markovian rewards (NMRDPs). They all exploit a temporal logic specification of the reward function to automatically translate the NMRDP into an equivalent Markov decision process (MDP) amenable to well-known MDP solution methods. They differ however in the representation of the target MDP and the class of MDP solution methods to which they are suited. As a result, they adopt different temporal logics and different translations. Unfortunately, no implementation of these methods nor experimental let alone comparative results have ever been reported. This paper is the first step towards filling this gap. We describe an integrated system for solving NMRDPs which implements these methods and several variants under a common interface; we use it to compare the various approaches and identify the problem features favoring one over the other.


AI@NICTA

AI Magazine

NICTA is Australia's Information and Communications Technology (ICT) Centre of Excellence. It is the largest organization in Australia dedicated to ICT research. While it has close links with local universities, it is in fact an independent but not-for-profit company in the business of doing research, commercializing that research and training PhD students to do that research. Much of the work taking place at NICTA involves various topics in artificial intelligence. In this article, we survey some of the AI work being undertaken at NICTA.


Exploiting First-Order Regression in Inductive Policy Selection

arXiv.org Artificial Intelligence

We consider the problem of computing optimal generalised policies for relational Markov decision processes. We describe an approach combining some of the benefits of purely inductive techniques with those of symbolic dynamic programming methods. The latter reason about the optimal value function using first-order decision theoretic regression and formula rewriting, while the former, when provided with a suitable hypotheses language, are capable of generalising value functions or policies for small instances. Our idea is to use reasoning and in particular classical first-order regression to automatically generate a hypotheses language dedicated to the domain at hand, which is then used as input by an inductive solver. This approach avoids the more complex reasoning of symbolic dynamic programming while focusing the inductive solver's attention on concepts that are specifically relevant to the optimal value function for the domain considered.