Achieving safe, fully-automated control of a dynamic system requires fast, accurate responses to maintain safety while also driving the system toward its objectives. Two basic approaches to this problem include using prebuilt reactive plans and employing online planning. Ideally, a comprehensive set of plans could be built and scheduled offline, allowing the system to precompute and guarantee its response time to any runtime situation. However, a prebuilt set of reactive plans may not provide appropriate responses to all possible situations, particularly for complex problems in which domain knowledge may be either incomplete or imprecise. Conversely, online planning may be used to search for appropriate reactions to situations as they arise, but online deliberation must be bounded such that it terminates before the available resource limits are exceeded.
We propose a new decision-theoretic approach for solving execution-time deliberation scheduling problems using recent advances in Generalized Semi-Markov Decision Processes (GSMDPs). In particular, we use GSMDPs to more accurately model domains in which planning and execution occur concurrently, planimprovement actions have uncertain effects and duration, and events (such as threats) occur asynchronously and stochastically. We demonstrate a significant improvement in expressibility over previous discrete-time approximate models in which mission phase duration was fixed, failure events were synchronized with phase transitions, and planning time was discretized into constant-size planning quanta.
An individual planning agent does not generally have sufficient computational resources at its disposal to produce an optimal plan in a complex domain, as deliberation itself requires and consumes scarce resources. This problem is further exacerbated in a distributed planning context in which multiple, heterogeneous agents must expend a portion of their resource allotment on communication, negotiation, and shared planning activities with other cooperative agents. Because other agents can have different temporal grain sizes, planning horizons, deadlines, and access to distinct local information, the delays associated with local deliberation and, in turn, shared negotiation are asynchronous, unpredictable, and widely variable. We address this problem using a principled, decisiontheoretic approach based on recent advances in Generalized Semi-Markov Decision Processes (GSMDPs). In particular, we use GSMDPs to model agents who develop a continuous-time deliberation policy offline which can then be consulted to dynamically select both deliberation-level and domain-level actions at plan execution time. This scheme allows individual agents to model other cooperative agents' actions essentially as asynchronous events, e.g., that might or might not fulfill a request (uncertain effect) after a stochasticallydetermined delay (uncertain event duration). With this approach, the decision-theoretic planner for the individual agent can make near-optimal execution-time decisions that trade off the risks and opportunities associated with their own actions, other agents' actions, and asynchronous external threats.
School of Computer Science Carnegie Mellon University Pittsburgh, Pennsylvania 15213-3890 (412) 268-8102 Abstract Faced with a complicated task, some initial planning can significantly increase the likelihood of success and increase efficiency, but planning for too long before starting to act can reduce efficiency. This paper explores the question of when to begin acting for a resource bounded agent. Limitations of an idealized algorithm suggested in the literature are presented and illustrated in the context of a robot courier. A revised, idealized algorithm is given and justified. The revised idealized algorithm is used as a basis for developing a new "step choice" algorithm for making on-the-fly decisions for a simplified version of the robot courier task. A set of experiments are used to illustrate the relative advantage of the new strategy over always act, always compute and anytime algorithm based strategies for deciding when to begin execution. Introduction Faced with a complicated task, some initial planning can significantly increase the likelihood of success and increase efficiency. Conversely, sitting on one's hands, considering all possible consequences of all possible actions does not accomplish anything. At some point, action is called for. But when is this point reached?