Collaborating Authors

 Dearden, Richard


Model-Based Bayesian Exploration

arXiv.org Artificial Intelligence

Reinforcement learning systems are often concerned with balancing exploration of untested actions against exploitation of actions that are known to be good. The benefit of exploration can be estimated using the classical notion of Value of Information: the expected improvement in future decision quality arising from the information acquired by exploration. Estimating this quantity requires an assessment of the agent's uncertainty about its current value estimates for states. In this paper we investigate ways of representing and reasoning about this uncertainty in algorithms where the system attempts to learn a model of its environment. We explicitly represent uncertainty about the parameters of the model and build probability distributions over Q-values based on them. These distributions are used to compute a myopic approximation to the value of information for each action, and hence to select the action that best balances exploration and exploitation.
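A minimal sketch of the idea, under assumptions not stated in the abstract (a small discrete MDP with known rewards, Dirichlet priors over transitions, and Q-value distributions estimated by sampling whole models from the posterior; all sizes and priors below are illustrative): sampled models are each solved for Q, and action selection adds a myopic value-of-perfect-information bonus to the mean Q-values.

    import numpy as np

    rng = np.random.default_rng(0)
    N_STATES, N_ACTIONS, GAMMA = 4, 2, 0.9
    R = rng.uniform(0, 1, size=(N_STATES, N_ACTIONS))  # known rewards (assumed)

    # Explicit model uncertainty: Dirichlet counts over next states per (s, a).
    counts = np.ones((N_STATES, N_ACTIONS, N_STATES))

    def q_posterior_samples(n_samples=30, iters=60):
        # Sample whole transition models from the Dirichlet posterior and
        # solve each for Q, giving an empirical distribution over Q-values.
        qs = np.empty((n_samples, N_STATES, N_ACTIONS))
        for k in range(n_samples):
            P = np.apply_along_axis(rng.dirichlet, 2, counts)
            Q = np.zeros((N_STATES, N_ACTIONS))
            for _ in range(iters):                  # value iteration
                Q = R + GAMMA * (P @ Q.max(axis=1))
            qs[k] = Q
        return qs

    def myopic_vpi(q, s):
        # Myopic value of perfect information, estimated from the Q samples.
        qs = q[:, s, :]
        means = qs.mean(axis=0)
        best, second = np.sort(means)[::-1][:2]
        vpi = np.empty(N_ACTIONS)
        for a in range(N_ACTIONS):
            if means[a] == best:  # gain if the believed-best is actually worse
                vpi[a] = np.maximum(second - qs[:, a], 0).mean()
            else:                 # gain if this action is actually better
                vpi[a] = np.maximum(qs[:, a] - best, 0).mean()
        return means, vpi

    means, vpi = myopic_vpi(q_posterior_samples(), s=0)
    print("chosen action:", int(np.argmax(means + vpi)))  # mean Q + VPI bonus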


Planning under Continuous Time and Resource Uncertainty: A Challenge for AI

arXiv.org Artificial Intelligence

We outline a class of problems, typical of Mars rover operations, that are problematic for current methods of planning under uncertainty. The existing methods fail because they suffer from one or more of the following limitations: 1) they rely on very simple models of actions and time, 2) they assume that uncertainty is manifested in discrete action outcomes, 3) they are only practical for very small problems. For many real-world problems, these assumptions fail to hold. In particular, when planning the activities for a Mars rover, none of the above assumptions is valid: 1) actions can be concurrent and have differing durations, 2) there is uncertainty concerning action durations and consumption of continuous resources like power, and 3) typical daily plans involve on the order of a hundred actions. This class of problems may be of particular interest to the UAI community because both classical and decision-theoretic planning techniques may be useful in solving it. We describe the rover problem, discuss previous work on planning under uncertainty, and present a detailed, but very small, example illustrating some of the difficulties of finding good plans.
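To make the difficulty concrete, here is a toy sketch (not from the paper; every number is invented) of why continuous duration and resource uncertainty matters: a plan that looks safe under mean-outcome reasoning still violates its deadline or power budget in a substantial fraction of Monte Carlo rollouts.

    import random

    random.seed(0)

    # Tiny two-action rover plan with stochastic durations and power draw.
    ACTIONS = [                      # (name, mean_minutes, mean_watt_hours)
        ("drive_to_rock", 30.0, 40.0),
        ("take_image",    10.0, 15.0),
    ]
    DEADLINE_MIN, BATTERY_WH = 48.0, 62.0

    def simulate_plan():
        t = wh = 0.0
        for _, mean_t, mean_e in ACTIONS:
            t += random.gauss(mean_t, 0.2 * mean_t)   # uncertain duration
            wh += random.gauss(mean_e, 0.2 * mean_e)  # uncertain power draw
        return t <= DEADLINE_MIN and wh <= BATTERY_WH

    trials = 10_000
    p = sum(simulate_plan() for _ in range(trials)) / trials
    print(f"P(plan meets deadline and power budget) ~ {p:.2f}")
    # Mean-outcome reasoning calls this plan safe (40 min, 55 Wh), yet a
    # substantial fraction of simulated runs violate one of the limits.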


Dynamic Programming for Structured Continuous Markov Decision Problems

arXiv.org Artificial Intelligence

We describe an approach for exploiting structure in Markov Decision Processes with continuous state variables. At each step of the dynamic programming algorithm, the state space is dynamically partitioned into regions where the value function is the same throughout the region. We first describe the algorithm for piecewise constant representations. We then extend it to piecewise linear representations, using techniques from POMDPs to represent and reason about linear surfaces efficiently. We show that for complex, structured problems, our approach exploits the natural structure so that optimal solutions can be computed efficiently.
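A minimal sketch of the piecewise-constant case, under assumptions far simpler than the paper's (a single continuous state dimension on [0, 1], deterministic shift actions, and a piecewise-constant reward, all invented for illustration): each backup repartitions the interval using the reward breakpoints plus the pre-images of the current value function's breakpoints, so the backed-up value is constant on every region and each region is evaluated at a single point.

    from bisect import bisect_right

    GAMMA = 0.9
    REWARD_BOUNDS, REWARD_VALS = [0.0, 0.8, 1.0], [0.0, 1.0]  # 0 then 1
    ACTIONS = [-0.15, 0.15]                        # shift left / shift right

    def clip(s):
        return min(max(s, 0.0), 1.0)

    def piece(bounds, vals, s):
        # Value of a piecewise-constant function at s.
        return vals[min(bisect_right(bounds, s) - 1, len(vals) - 1)]

    def backup(bounds, vals):
        # One DP step: repartition [0, 1] so the backed-up value is
        # constant on each region, then evaluate one point per region.
        pts = set(REWARD_BOUNDS)
        for d in ACTIONS:                          # pre-images of V's breakpoints
            pts.update(clip(b - d) for b in bounds)
        new_bounds = sorted(pts)
        new_vals = []
        for lo, hi in zip(new_bounds, new_bounds[1:]):
            s = (lo + hi) / 2                      # any point inside the region
            new_vals.append(max(
                piece(REWARD_BOUNDS, REWARD_VALS, s)
                + GAMMA * piece(bounds, vals, clip(s + d))
                for d in ACTIONS))
        return new_bounds, new_vals

    bounds, vals = [0.0, 1.0], [0.0]               # V0 = 0 everywhere
    for _ in range(20):
        bounds, vals = backup(bounds, vals)
    print(list(zip(bounds, vals)))                 # (left endpoint, value) pairs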


A Switching Planner for Combined Task and Observation Planning

AAAI Conferences

From an automated planning perspective, the problem of practical mobile robot control in realistic environments poses many important and competing challenges. On the one hand, the planning process must be lightweight, robust, and timely. Over the lifetime of the robot it must always respond quickly with new plans that accommodate exogenous events, changing objectives, and the underlying unpredictability of the environment. On the other hand, in order to promote efficient behaviours the planning process must perform computationally expensive reasoning about contingencies and possible revisions of subjective beliefs according to quantitatively modelled uncertainty in acting and sensing. Towards addressing these challenges, we develop a continual planning approach that switches between a fast satisficing "classical" planner, used to decide on the overall strategy, and decision-theoretic planning for small abstract subproblems where deeper consideration of the sensing model is both practical and able to significantly impact overall performance. We evaluate our approach on large problems from a realistic robot exploration domain.
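A minimal sketch of the switching control loop. The components below (classical_plan, solve_subproblem, and the room-exploration task) are hypothetical stubs, not the paper's planners, but the control flow shows the idea: a fast satisficing planner picks the overall strategy, a small decision-theoretic routine handles each sensing subproblem, and the system replans from the updated state on every iteration.

    import random

    random.seed(1)

    def classical_plan(explored, goal):
        # Fast satisficing planner: decide the overall strategy while
        # ignoring sensing detail (stub: list the rooms still to explore).
        return [("explore", room) for room in sorted(goal - explored)]

    def solve_subproblem(belief=0.5):
        # Decision-theoretic planning on a small abstract subproblem (stub:
        # keep taking noisy sensor readings until the belief is confident).
        while 0.1 < belief < 0.9:
            obs = random.random() < 0.7            # simulated sensor reading
            like = 0.8 if obs else 0.2             # P(obs | object present)
            belief = like * belief / (like * belief + (1 - like) * (1 - belief))
        return belief >= 0.9

    def run(goal):
        explored, found = set(), {}
        while True:
            plan = classical_plan(explored, goal)  # switch: strategic level
            if not plan:
                return found
            _, room = plan[0]
            found[room] = solve_subproblem()       # switch: decision-theoretic
            explored.add(room)                     # continual: replan next loop

    print(run({"kitchen", "lab", "office"}))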


Navigation Planning in Probabilistic Roadmaps with Uncertainty

AAAI Conferences

Probabilistic Roadmaps (PRMs) are a commonly used class of algorithms for robot navigation tasks where obstacles are present in the environment. We examine the situation where the obstacle positions are not precisely known. A subset of the edges in the PRM graph may possibly intersect the obstacles, and as the robot traverses the graph it can make noisy observations of these uncertain edges to determine whether it can traverse them. The problem is to traverse the graph from an initial vertex to a goal without taking a blocked edge; to do this optimally, the robot needs to consider the observations it can make as well as the structure of the graph. In this paper we show how this problem can be represented as a POMDP. We show that, although the POMDP is too large to be solved with exact methods, approximate point-based methods can provide a good-quality solution. While feasible for smaller examples, however, this approach does not scale. By exploiting the structure in the belief space, we can construct an approximate belief-space MDP that can be solved efficiently using recent techniques in MDP planning. We then demonstrate that this gives near-optimal results in most cases while achieving an order-of-magnitude speed-up in policy generation time.
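A drastically simplified sketch of the belief-space idea, on an invented toy graph: observations here are perfect and free, so each uncertain edge's belief collapses to known-free or known-blocked the moment the robot reaches an endpoint, whereas the paper keeps noisy observations and a richer belief space. Value iteration over the finite set of (vertex, edge-status) belief states then gives the expected traversal cost.

    import itertools

    EDGES = {(0, 1): 1.0, (1, 3): 5.0, (2, 3): 1.0}  # certain edges, with costs
    UNCERTAIN = [((1, 2), 1.0, 0.4)]                 # (edge, cost, P(blocked))
    VERTICES, GOAL = {0, 1, 2, 3}, 3

    def neighbours(v, status):
        for (a, b), c in EDGES.items():
            if v in (a, b):
                yield (b if v == a else a), c
        for i, ((a, b), c, _) in enumerate(UNCERTAIN):
            if v in (a, b) and status[i] == "F":     # only if known free
                yield (b if v == a else a), c

    def resolve(v, status):
        # Chance outcomes of observing the unknown uncertain edges at v.
        outcomes = [(status, 1.0)]
        for i, ((a, b), _, p_blk) in enumerate(UNCERTAIN):
            if v in (a, b) and status[i] == "U":
                outcomes = [(st[:i] + (sym,) + st[i + 1:], pr * p)
                            for st, pr in outcomes
                            for sym, p in (("B", p_blk), ("F", 1 - p_blk))]
        return outcomes

    # Value iteration over belief states (vertex, status of each uncertain edge).
    STATUSES = list(itertools.product("UFB", repeat=len(UNCERTAIN)))
    V = {(v, st): 0.0 if v == GOAL else 1e6
         for v in VERTICES for st in STATUSES}
    for _ in range(50):
        for v, st in V:
            if v != GOAL:
                V[(v, st)] = min(
                    (c + sum(p * V[(u, s2)] for s2, p in resolve(u, st))
                     for u, c in neighbours(v, st)),
                    default=1e6)

    print("expected cost from vertex 0:", V[(0, ("U",) * len(UNCERTAIN))])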