Goto

Collaborating Authors

 Cserna, Bence


Improved Safe Real-time Heuristic Search

arXiv.org Artificial Intelligence

A fundamental concern in real-time planning is the presence of dead-ends in the state space, from which no goal is reachable. Recently, the SafeRTS algorithm was proposed for searching in such spaces. SafeRTS exploits a user-provided predicate to identify safe states, from which a goal is likely reachable, and attempts to maintain a backup plan for reaching a safe state at all times. In this paper, we study the SafeRTS approach, identify certain properties of its behavior, and design an improved framework for safe real-time search. We prove that the new approach performs at least as well as SafeRTS and present experimental results showing that its promise is fulfilled in practice.


Situated Planning for Execution Under Temporal Constraints

AAAI Conferences

One of the original motivations for domain-independent planning was to generate plans that would then be executed in the environment. However, most existing planners ignore the passage of time during planning. While this can work well when absolute time does not play a role, this approach can lead to plans failing when there are external timing constraints, such as deadlines. In this paper, we describe a new approach for time-sensitive temporal planning. Our planner is aware of the fact that plan execution will start only once planning finishes, and incorporates this information into its decision making, in order to focus the search on branches that are more likely to lead to plans that will be feasible when the planner finishes.


Avoiding Dead Ends in Real-Time Heuristic Search

AAAI Conferences

Many systems, such as mobile robots, need to be controlled in real time. Real-time heuristic search is a popular on-line planning paradigm that supports concurrent planning and execution. However,existing methods do not incorporate a notion of safety and we show that they can perform poorly in domains that contain dead-end states from which a goal cannot be reached. We introduce new real-time heuristic search methods that can guarantee safety if the domain obeys certain properties. We test these new methods on two different simulated domains that contain dead ends, one that obeys the properties and one that does not. We find that empirically the new methods provide good performance. We hope this work encourages further efforts to widen the applicability of real-time planning.


Value Directed Exploration in Multi-Armed Bandits with Structured Priors

arXiv.org Machine Learning

Multi-armed bandits are a quintessential machine learning problem requiring the balancing of exploration and exploitation. While there has been progress in developing algorithms with strong theoretical guarantees, there has been less focus on practical near-optimal finite-time performance. In this paper, we propose an algorithm for Bayesian multi-armed bandits that utilizes value-function-driven online planning techniques. Building on previous work on UCB and Gittins index, we introduce linearly-separable value functions that take both the expected return and the benefit of exploration into consideration to perform n-step lookahead. The algorithm enjoys a sub-linear performance guarantee and we present simulation results that confirm its strength in problems with structured priors. The simplicity and generality of our approach makes it a strong candidate for analyzing more complex multi-armed bandit problems.