Planning & Scheduling
A Survey of the Seventh International Planning Competition
In this article we review the 2011 International Planning Competition. We give an overview of the history of the competition, discussing how it has developed since its first edition in 1998. The 2011 competition was run in three main separate tracks: the deterministic (classical) track; the learning track; and the uncertainty track. Each track proposed its own distinct set of new challenges and the participants rose to these admirably, the results of each track showing promising progress in each area. The competition attracted a record number of participants this year, showing its continued and strong position as a major central pillar of the international planning research community.
A Survey of Research in Distributed, Continual Planning
Complex, real-world domains require rethinking traditional approaches to AI planning. Planning and executing the resulting plans in a dynamic environment implies a continual approach in which planning and execution are interleaved, uncertainty in the current and projected world state is recognized and handled appropriately, and replanning can be performed when the situation changes or planned actions fail. Furthermore, complex planning and execution problems may require multiple computational agents and human planners to collaborate on a solution. In this article, we describe a new paradigm for planning in complex, dynamic environments, which we term distributed, continual planning (DCP). We argue that developing DCP systems will be necessary for planning applications to be successful in these environments.
Robot: Mere Machine to Transcendent Mind
With regard to the first caveat, Moravec's estimates of animal equivalence are based solely on hardware complexity. It is often the case that hardware alone cannot deliver performance, but it also requires software sufficient to the task. Thus, projecting "human equivalence" in computer hardware does not mean that "human equivalent software" will be ready to run on the hardware. With regard to the second caveat, neuroscientists regularly adjust upward the estimate of human brain complexity. This upward estimation has the effect of extending the estimated time at which comput-Hans Moravec, Robot: Mere Machine to Transcendent Mind.
1173
We stare intensely at the robot with one eye, keeping the other one out for any surprises. It looks for the door and slowly starts moving into the room. Our minds seem to be sharing the same thought--" YODA, don't fail us now." We decided that the Office Navigation event in the robot competition was to be our first milestone in working toward this goal. It would provide us a context in which to direct our efforts.
A Predictive Model for Satisfying Conflicting Objectives in Scheduling Problems
The economic viability of a manufacturing organization depends on its ability to maximize customer services; maintain efficient, low-cost operations; and minimize total investment. These objectives conflict with one another and, thus, are difficult to achieve on an operational basis. Much of the work in the area of automated scheduling systems recognizes this problem but does not address it effectively. The work presented by this Ph.D. dissertation was motivated by the desire to generate good, costeffective schedules in dynamic and stochastic manufacturing environments (Berry 1991). Experimental analysis is used to illustrateโฆthe PCP approach within an advanced scheduling architecture.
A Planner Called R
STRIPS (as given in Nilsson [1998, pp. I now illustrate the planning algorithm with an example from the blocks world. In the version given at the Fifth International Conference on Artificial Intelligence Planning and Scheduling (AIPS'00) competition, there are four actions, given below in Although the planner returns a reasonable plan in the previous example, this might not be the case in general. For example, given the same initial situation description and the goal on(a, b), on(b, c), it returns the plan [unstack(c, b), putdown (c), unstack(b, a), putdown(b), pickup(a), stack(a, b), unstack(a, b), putdown(a), pickup(b), stack(b, c), pickup(a), stack(a,b)] which is obviously not a good one. For the AIPS'00 competition, our team implemented the following strategy for postprocessing: Remove all immediate cycles.
A New Technique Enables Dynamic Replanning and Rescheduling of Aeromedical Evacuation
We describe an application of a dynamic replanning technique in a highly dynamic and complex domain: the military aeromedical evacuation of patients to medical treatment facilities. U.S. Transportation Command (USTRANSCOM) is the U.S. Department of Defense (DoD) agency responsible for evacuating patients during wartime and peace. Doctrinally, patients requiring extended treatment must be evacuated by air to a suitable medical treatment facility. The Persian Gulf War was the first significant armed conflict in which this concept was put to a serious test. The results were far from satisfactory--about 60 percent of the patients ended up at the wrong destinations.
A Call for Knowledge-Based Planning
We are interested in solving real-world planning problems and, to that end, argue for the use of domain knowledge in planning. We believe that the field must develop methods capable of using rich knowledge models to make planning tools useful for complex problems. We discuss the suitability of current planning paradigms for solving these problems. Real-world problems have been found to require more expressive representations and capabilities than are needed for the standard set of benchmark planning problems (blocks world, towers of Hanoi, simplified logistics, and the like) or for the problems used in the 1998 and 2000 Artificial Intelligence Planning and Scheduling (AIPS) Conference planning competitions (Bacchus et al. 2000; Long 2000; McDermott 2000). Past research in AI planning can roughly be divided into two camps: (1) systems that take a minimalist approach to domain knowledge and (2) systems that focus on leveraging as much domain knowledge as possible.
Scalable Planning with Tensorflow for Hybrid Nonlinear Domains
Wu, Ga, Say, Buser, Sanner, Scott
Given recent deep learning results that demonstrate the ability to effectively optimize high-dimensional non-convex functions with gradient descent optimization on GPUs, we ask in this paper whether symbolic gradient optimization tools such as Tensorflow can be effective for planning in hybrid (mixed discrete and continuous) nonlinear domains with high dimensional state and action spaces? To this end, we demonstrate that hybrid planning with Tensorflow and RMSProp gradient descent is competitive with mixed integer linear program (MILP) based optimization on piecewise linear planning domains (where we can compute optimal solutions) and substantially outperforms state-of-the-art interior point methods for nonlinear planning domains. Furthermore, we remark that Tensorflow is highly scalable, converging to a strong plan on a large-scale concurrent domain with a total of 576,000 continuous action parameters distributed over a horizon of 96 time steps and 100 parallel instances in only 4 minutes. We provide a number of insights that clarify such strong performance including observations that despite long horizons, RMSProp avoids both the vanishing and exploding gradient problems. Together these results suggest a new frontier for highly scalable planning in nonlinear hybrid domains by leveraging GPUs and the power of recent advances in gradient descent with highly optimized toolkits like Tensorflow.
Active Exploration for Learning Symbolic Representations
Andersen, Garrett, Konidaris, George
We introduce an online active exploration algorithm for data-efficiently learning an abstract symbolic model of an environment. Our algorithm is divided into two parts: the first part quickly generates an intermediate Bayesian symbolic model from the data that the agent has collected so far, which the agent can then use along with the second part to guide its future exploration towards regions of the state space that the model is uncertain about. We show that our algorithm outperforms random and greedy exploration policies on two different computer game domains. The first domain is an Asteroids-inspired game with complex dynamics but basic logical structure. The second is the Treasure Game, with simpler dynamics but more complex logical structure.