How does one deal with the unexpected? Our world is full of surprises, and we humans are often able to correctly identify a problem and respond appropriately. Consider a new driver encountering their first traffic circle, a student experiencing a hard drive failure in the middle of an assignment, or a job candidate being asked an unexpected question during an interview. In situations where we have a goal (e.g., reach a destination or submit a completed assignment), we may need to alter our original plan when the unexpected occurs. Could we enable autonomous artificially intelligent agents to do the same?
A long-standing area of artificial intelligence is automated planning. The traditional planning problem is to generate a sequence of actions given a concrete, specific goal (e.g., I will be home at dinnertime) and a set of specific actions (e.g., drive-car, fill-gas-tank, walk, etc.). Generating plans from start to finish that are efficient and ideally optimal under different circumstances (e.g., delayed effects) is an active area of research. After a plan has been generated, the environment may change while the plan is being executed. For example, a robot retrieving packages in a warehouse may discover it has dropped its package. Or perhaps another robot has broken down due to a hardware failure and is blocking this robot's path. How can a robot (or any AI agent) know something unexpected has happened without knowing all possible future failures?
Fundamental research on autonomy aims to find general approaches to solve this problem. One approach is to generate expectations: facts that should be true during different stages of a plan's execution. When an expectation is violated, a discrepancy occurs between the expected and perceived facts. A new trend in autonomy is to include goal reasoning capabilities. In the event of a failure, the original goal may no longer be warranted. Perhaps robust autonomous agents need to generate and change their goals in response to a changing environment.
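The expectation idea above can be sketched in a few lines. This is only an illustrative sketch, assuming the world state is a flat dictionary of facts; the function name `find_discrepancies` and the fact names (`holding_package`, `path_clear`, `battery`) are hypothetical and not taken from any particular system.

```python
from typing import Dict, List, Tuple

State = Dict[str, object]


def find_discrepancies(expected: State, perceived: State) -> List[Tuple[str, object, object]]:
    """Return (fact, expected_value, perceived_value) for each violated expectation.

    Facts that are perceived but carry no expectation are ignored; a fact that
    is expected but not perceived at all is reported with a perceived value of None.
    """
    discrepancies = []
    for fact, expected_value in expected.items():
        perceived_value = perceived.get(fact)
        if perceived_value != expected_value:
            discrepancies.append((fact, expected_value, perceived_value))
    return discrepancies


# Hypothetical warehouse robot: after a "pick-up" action it expects to be
# holding the package and to have a clear path ahead.
expected = {"holding_package": True, "path_clear": True}
perceived = {"holding_package": False, "path_clear": True, "battery": 0.8}

print(find_discrepancies(expected, perceived))
# [('holding_package', True, False)]
```

The agent does not need a catalog of possible failures here: it only compares what should be true with what it perceives, and any mismatch becomes a trigger for explanation, replanning, or goal change.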
Autonomous systems still have a long way to go, and open research questions remain. Funding agencies consistently seek new research on autonomy for diverse operations ranging from cybersecurity to military and vehicular autonomy. What will autonomous systems be like in the future? Will we achieve autonomous agents that can handle any situation they encounter?
- Dustin Dannenhauer
In this article we review the 2011 International Planning Competition. We give an overview of the history of the competition, discussing how it has developed since its first edition in 1998. The 2011 competition was run in three main tracks: the deterministic (classical) track, the learning track, and the uncertainty track. Each track posed its own set of new challenges, and the participants rose to them admirably; the results of each track show promising progress in its area. The competition attracted a record number of participants this year, confirming its continued position as a central pillar of the international planning research community.
Complex, real-world domains require rethinking traditional approaches to AI planning. Planning and executing the resulting plans in a dynamic environment implies a continual approach in which planning and execution are interleaved, uncertainty in the current and projected world state is recognized and handled appropriately, and replanning can be performed when the situation changes or planned actions fail. Furthermore, complex planning and execution problems may require multiple computational agents and human planners to collaborate on a solution. In this article, we describe a new paradigm for planning in complex, dynamic environments, which we term distributed, continual planning (DCP). We argue that developing DCP systems will be necessary for planning applications to be successful in these environments.
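To make the interleaving of planning and execution concrete, here is a minimal single-agent sketch. It covers only the "continual" part (execute, detect a failed action, replan), not the distributed or human-in-the-loop aspects. The grid world, the breadth-first planner, and the names `plan` and `plan_and_execute` are assumptions chosen for illustration, not part of the DCP paradigm as described.

```python
from collections import deque


def plan(start, goal, blocked, size):
    """Breadth-first search on a size x size grid.

    Returns the list of cells to visit after `start`, or None if the goal
    cannot be reached given the obstacles known so far.
    """
    frontier = deque([start])
    parent = {start: None}
    while frontier:
        cell = frontier.popleft()
        if cell == goal:
            path = []
            while cell != start:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        x, y = cell
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            inside = 0 <= nxt[0] < size and 0 <= nxt[1] < size
            if inside and nxt not in blocked and nxt not in parent:
                parent[nxt] = cell
                frontier.append(nxt)
    return None


def plan_and_execute(start, goal, true_obstacles, size):
    """Interleave execution with replanning; obstacles are discovered on contact."""
    known = set()  # obstacles the agent has discovered so far
    position, replans = start, 0
    path = plan(position, goal, known, size)
    while path:  # stops when the path is empty (done) or None (no plan exists)
        step = path[0]
        if step in true_obstacles:
            # The move fails: the world differs from the agent's model.
            known.add(step)
            path = plan(position, goal, known, size)
            replans += 1
            continue
        position, path = step, path[1:]
    return position, replans


print(plan_and_execute((0, 0), (2, 0), {(1, 0)}, size=3))
# ((2, 0), 1)
```

The agent starts with an optimistic model, commits to a plan, and revises it only when execution contradicts the model. A fuller treatment would also track uncertainty in the projected state and coordinate with other planners.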
In this semi-tutorial paper, we first review the information-theoretic approach to accounting for the computational costs incurred during the search for optimal actions in a sequential decision-making problem. The traditional Markov decision process (MDP) framework ignores computational limitations while searching for optimal policies, essentially assuming that the acting agent is perfectly rational and aims for exact optimality. Using the free energy, we introduce a variational principle that accounts not only for the value of a policy but also for the cost of finding that policy. The solution of the variational equations arising from this formulation can be obtained using familiar Bellman-like value iterations from dynamic programming (DP) and the Blahut-Arimoto (BA) algorithm from rate distortion theory. Finally, we demonstrate the utility of the approach for generating hierarchies of state abstractions that can be used to best exploit the available computational resources. A numerical example showcases these concepts for a path-planning problem in a grid world environment.
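The trade-off between value and the cost of finding a policy can be illustrated for a single decision. The sketch below assumes the form commonly used in the information-theoretic bounded-rationality literature: the agent maximizes expected value minus (1/beta) times the KL divergence from a prior policy, which gives a policy proportional to prior(a) * exp(beta * Q(a)). The exact formulation and sign conventions in the paper may differ, and the function name `bounded_rational_policy` is hypothetical. It shows only the one-step building block, not the full value iteration or the Blahut-Arimoto update of the prior.

```python
import math


def bounded_rational_policy(q_values, prior, beta):
    """One-step free-energy trade-off (assumed standard form).

    Returns (policy, free_energy) where
      policy(a)   is proportional to prior(a) * exp(beta * Q(a))
      free_energy = (1 / beta) * log(sum_a prior(a) * exp(beta * Q(a)))
    beta > 0 acts as the resource parameter: small beta keeps the policy
    close to the prior, large beta approaches the greedy (rational) choice.
    """
    if beta <= 0:
        raise ValueError("beta must be positive")
    m = max(q_values)  # subtract the maximum for numerical stability
    weights = [p * math.exp(beta * (q - m)) for q, p in zip(q_values, prior)]
    z = sum(weights)
    policy = [w / z for w in weights]
    free_energy = m + math.log(z) / beta
    return policy, free_energy


q = [1.0, 0.0]
uniform_prior = [0.5, 0.5]
for beta in (0.01, 1.0, 50.0):
    policy, free_energy = bounded_rational_policy(q, uniform_prior, beta)
    print(beta, [round(p, 3) for p in policy], round(free_energy, 3))
```

As beta grows, the free energy moves from the prior's expected value toward the maximum Q-value, which is one way to read "the cost of finding the optimal policy" in this setting.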
In this work, we investigate a multi-period Home Health Care Scheduling Problem (HHCSP) under stochastic service and travel times. We first model the deterministic problem as an integer linear programming model that incorporates real-world requirements, such as time windows, continuity of care, workload fairness, and inter-visit temporal dependencies. We then extend the model to cope with uncertainty in durations by introducing chance constraints into the formulation. We propose efficient solution approaches, which provide quantifiably near-optimal solutions and further handle the uncertainties by employing a sampling-based strategy. We demonstrate the effectiveness of our proposed approaches on instances synthetically generated from a real-world dataset for both deterministic and stochastic scenarios.
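As a rough illustration of how a sampling-based strategy can evaluate a chance constraint, the sketch below estimates the probability that one caregiver's route meets every time window and compares it with a target level 1 - alpha. Everything specific here is an assumption made for the example: the route representation, the uniform noise model around mean durations, and the function names. It is not the paper's model or solution method.

```python
import random
from typing import List, Tuple

# (mean travel time to the visit, mean service time, window opens, window closes)
Visit = Tuple[float, float, float, float]


def estimate_on_time_probability(route: List[Visit], n_samples: int = 5000,
                                 spread: float = 0.2, seed: int = 0) -> float:
    """Monte Carlo estimate of P(every visit starts within its time window).

    Assumption: each duration is drawn uniformly from
    [mean * (1 - spread), mean * (1 + spread)].
    """
    rng = random.Random(seed)
    successes = 0
    for _ in range(n_samples):
        clock = 0.0
        feasible = True
        for travel, service, opens, closes in route:
            clock += travel * rng.uniform(1 - spread, 1 + spread)
            clock = max(clock, opens)  # wait if the caregiver arrives early
            if clock > closes:         # service would start too late
                feasible = False
                break
            clock += service * rng.uniform(1 - spread, 1 + spread)
        if feasible:
            successes += 1
    return successes / n_samples


def satisfies_chance_constraint(route: List[Visit], alpha: float = 0.1) -> bool:
    """Check the sampled version of P(route is on time) >= 1 - alpha."""
    return estimate_on_time_probability(route) >= 1 - alpha


relaxed_route = [(10, 30, 0, 100), (10, 30, 0, 200)]
tight_route = [(10, 30, 0, 10)]
print(satisfies_chance_constraint(relaxed_route))
print(estimate_on_time_probability(tight_route))
```

In an optimization model the same idea typically appears as constraints over a set of sampled scenarios rather than as an after-the-fact check, but the estimate above conveys what the chance constraint is asking for.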
A critical challenge in temporal planning is robustly dealing with non-determinism introduced by the environment, e.g., the durational uncertainty of an action taken by a robot in the physical world due to slippage or other unexpected influences. Recent advances show that robustness, which accounts for uncertainty in predicting schedule success, is a better measure of solution quality than traditional metrics such as flexibility. This paper introduces the Robust Execution Problem (REP) for finding maximally robust dispatch strategies for general probabilistic temporal planning problems. While the REP is generally intractable in practice, we introduce approximate solution techniques—one that can be computed statically prior to the start of execution while providing robustness guarantees and one that dynamically adjusts to opportunities and setbacks during execution. We show empirically that dynamically optimizing for robustness improves the likelihood of execution success.
The cost-optimal track of the 2014 International Planning Competition had an unexpected outcome. Unlike the preceding competition in 2011, where explicit-state heuristic search planning scored best, advances in state-set exploration with binary decision diagrams (BDDs) gave a significant lead. In this paper we review the outcome of the competition, briefly looking into the internals of the competing systems.
Interruptible pure exploration in multi-armed bandits (MABs) is a key component of Monte-Carlo tree search algorithms for sequential decision problems. We introduce Discriminative Bucketing (DB), a novel family of strategies for pure exploration in MABs, which allows for adapting recent advances in non-interruptible strategies to the interruptible setting, while guaranteeing exponential-rate performance improvement over time. Our experimental evaluation demonstrates that the corresponding instances of DB compete favorably with both the currently popular strategies UCB1 and Epsilon-Greedy and the conservative uniform sampling.
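For readers unfamiliar with the baselines named above, the following sketch runs the three of them (uniform sampling, Epsilon-Greedy, and UCB1) in a pure-exploration setting: arms are sampled for a fixed budget and the arm with the best empirical mean is then recommended. It does not implement Discriminative Bucketing. Bernoulli rewards, the parameter values, and the function name `recommend_arm` are assumptions made for the example.

```python
import math
import random

STRATEGIES = ("uniform", "epsilon_greedy", "ucb1")


def recommend_arm(arm_means, budget, strategy, epsilon=0.1, seed=0):
    """Sample Bernoulli arms for `budget` pulls, then recommend the best-looking arm."""
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy}")
    k = len(arm_means)
    if budget < k:
        raise ValueError("budget must allow at least one pull per arm")
    rng = random.Random(seed)
    counts = [0] * k
    sums = [0.0] * k

    def mean(a):
        return sums[a] / counts[a]

    for t in range(budget):
        if t < k:
            arm = t  # initialization: pull every arm once
        elif strategy == "uniform":
            arm = t % k  # round-robin over the arms
        elif strategy == "epsilon_greedy":
            if rng.random() < epsilon:
                arm = rng.randrange(k)  # explore a random arm
            else:
                arm = max(range(k), key=mean)  # exploit the current best
        else:  # ucb1: empirical mean plus an exploration bonus
            arm = max(range(k),
                      key=lambda a: mean(a) + math.sqrt(2 * math.log(t) / counts[a]))
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward

    # Interrupting at any budget simply means recommending the best empirical arm so far.
    return max(range(k), key=mean)


for name in STRATEGIES:
    print(name, recommend_arm([0.2, 0.5, 0.8], budget=2000, strategy=name))
```

Comparing how often each strategy recommends the truly best arm at different budgets is the kind of experiment the abstract refers to; the single run per strategy here is only meant to show the mechanics.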
PDDL+ planning involves reasoning about mixed discrete-continuous change over time. Nearly all PDDL+ planners assume that continuous change is linear. We present a new technique that accommodates nonlinear change by encoding problems as nonlinear hybrid systems. Using this encoding, we apply a Satisfiability Modulo Theories (SMT) solver to find plans. We show that it is important to use novel planning-specific heuristics for variable and value selection in SMT solving, inspired by recent advances in planning as SAT. We show the promising performance of the resulting solver on challenging nonlinear problems.