Planning & Scheduling
Machine Cognition Models: EPAM and GPS
Throughout history, humans have tried to delegate their daily tasks to other creatures, which was the main reason behind the rise of civilizations. It started with using animals to automate tasks in agriculture (e.g. bulls), transportation (e.g. horses and donkeys), and even communication (pigeons). Millennia later came the Golden Age, with Al-Jazari and other Muslim inventors who were the pioneers of automation; their work gave birth to the industrial revolution in Europe centuries afterwards. At the end of the nineteenth century a new era was about to begin, the computational era: the most advanced technological and scientific development, which is driving mankind and lies behind progress across the sciences, such as medicine, communication, education, and physics. At this edge of technology, engineers and scientists are trying to model a machine that behaves as they do, which pushed us to think about designing and implementing "things that think"; thus artificial intelligence was born. In this work we cover two major studies in the field of machine cognition: the "Elementary Perceiver and Memorizer" (EPAM) and the "General Problem Solver" (GPS). The first focuses mainly on implementing human verbal learning behavior, while the second tries to model an architecture that can solve problems in general (e.g. theorem proving, chess playing, and arithmetic). We cover the major goals and main ideas of each model, compare their strengths and weaknesses, and give their fields of application. Finally, we suggest a real-life implementation of a cognitive machine.
Proximity-Based Non-uniform Abstractions for Approximate Planning
Baum, J., Nicholson, A. E., Dix, T. I.
In a deterministic world, a planning agent can be certain of the consequences of its planned sequence of actions. Not so, however, in dynamic, stochastic domains where Markov decision processes are commonly used. Unfortunately these suffer from the 'curse of dimensionality': if the state space is a Cartesian product of many small sets ('dimensions'), planning is exponential in the number of those dimensions. Our new technique exploits the intuitive strategy of selectively ignoring various dimensions in different parts of the state space. The resulting non-uniformity has strong implications, since the approximation is no longer Markovian, requiring the use of a modified planner. We also use a spatial and temporal proximity measure, which responds to continued planning as well as movement of the agent through the state space, to dynamically adapt the abstraction as planning progresses. We present qualitative and quantitative results across a range of experimental domains showing that an agent exploiting this novel approximation method successfully finds solutions to the planning problem using much less than the full state space. We assess and analyse the features of domains which our method can exploit.
Integration of Online Learning into HTN Planning for Robotic Tasks
Magnenat, Stéphane (ETH Zurich) | Chappelier, Jean-Cédric (EPFL) | Mondada, Francesco (EPFL)
This paper extends hierarchical task network (HTN) planning with lightweight learning, considering that in robotics, actions have a non-zero probability of failing. Our work applies to A*-based HTN planners with lifting. We prove that the planner finds the plan of maximal expected utility, while retaining its lifting capability and efficient heuristic-based search. We show how to learn the probabilities online, which allows a robot to adapt by replanning on execution failures. The idea behind this work is to use the HTN domain to constrain the space of possibilities, and then to learn on the constrained space in a way requiring few training samples, rendering the method applicable to autonomous mobile robots.
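The idea of learning action failure probabilities online and preferring the most promising plan can be illustrated with a minimal sketch. It is not the authors' planner: the `SuccessEstimator` class, the Laplace-smoothed estimate, the independence assumption between actions, and the use of estimated success probability as a stand-in for expected utility are all assumptions made for illustration, and the candidate plans are simply given rather than produced by an HTN search.

```python
from collections import defaultdict


class SuccessEstimator:
    """Hypothetical online estimator of per-action success probability."""

    def __init__(self):
        self.successes = defaultdict(int)
        self.trials = defaultdict(int)

    def record(self, action, succeeded):
        # Update the counts after each execution attempt.
        self.trials[action] += 1
        if succeeded:
            self.successes[action] += 1

    def probability(self, action):
        # Laplace smoothing (assumption): an untried action gets 0.5,
        # so few samples are needed before estimates become usable.
        return (self.successes[action] + 1) / (self.trials[action] + 2)


def plan_success_probability(plan, estimator):
    # Assumes action outcomes are independent of each other.
    p = 1.0
    for action in plan:
        p *= estimator.probability(action)
    return p


def best_plan(plans, estimator):
    # Pick the candidate with the highest estimated success probability.
    return max(plans, key=lambda plan: plan_success_probability(plan, estimator))


estimator = SuccessEstimator()
for _ in range(3):
    estimator.record("grasp", False)  # grasping keeps failing
    estimator.record("push", True)    # pushing keeps working

candidates = [("grasp", "carry"), ("push",)]
print(best_plan(candidates, estimator))
```

Calling `best_plan` again after each recorded failure mimics replanning on execution failures: as the estimate for a failing action drops, plans that avoid it start to win.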
Computing All-Pairs Shortest Paths by Leveraging Low Treewidth
Planken, L. R., de Weerdt, M. M., van der Krogt, R. P.J.
We present two new and efficient algorithms for computing all-pairs shortest paths. The algorithms operate on directed graphs with real (possibly negative) weights. They make use of directed path consistency along a vertex ordering d. Both algorithms run in O(n^2 w_d) time, where w_d is the graph width induced by this vertex ordering. For graphs of constant treewidth, this yields O(n^2) time, which is optimal. On chordal graphs, the algorithms run in O(nm) time. In addition, we present a variant that exploits graph separators to arrive at a run time of O(n w_d^2 + n^2 s_d) on general graphs, where s_d <= w_d is the size of the largest minimal separator induced by the vertex ordering d. We show empirically that on both constructed and realistic benchmarks, in many cases the algorithms outperform Floyd-Warshall's as well as Johnson's algorithm, which represent the current state of the art with a run time of O(n^3) and O(nm + n^2 log n), respectively. Our algorithms can be used for spatial and temporal reasoning, such as for the Simple Temporal Problem, which underlines their relevance to the planning and scheduling community.
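For reference, the O(n^3) Floyd-Warshall baseline named in the abstract can be sketched as follows. This is the textbook algorithm, not either of the authors' new algorithms; the function name and the edge-list input format are assumptions chosen for illustration, and the graph is assumed to contain no negative cycles.

```python
import math


def floyd_warshall(n, edges):
    """All-pairs shortest paths on a directed graph with real weights.

    n     -- number of vertices, labelled 0..n-1
    edges -- iterable of (u, v, weight) triples; weights may be negative,
             but the graph must not contain a negative cycle
    """
    dist = [[math.inf] * n for _ in range(n)]
    for v in range(n):
        dist[v][v] = 0.0
    for u, v, w in edges:
        dist[u][v] = min(dist[u][v], w)  # keep the lightest parallel edge

    # Allow vertex k as an intermediate stop, one vertex at a time.
    for k in range(n):
        for i in range(n):
            d_ik = dist[i][k]
            if d_ik == math.inf:
                continue  # no path i -> k, nothing to relax
            for j in range(n):
                if d_ik + dist[k][j] < dist[i][j]:
                    dist[i][j] = d_ik + dist[k][j]
    return dist


edges = [(0, 1, 4.0), (1, 2, -2.0), (0, 2, 3.0), (2, 3, 1.0)]
dist = floyd_warshall(4, edges)
print(dist[0][2])  # 2.0, going through vertex 1 beats the direct edge of 3.0
```

The triple loop touches every vertex pair for every intermediate vertex regardless of graph structure, which is the cost that algorithms exploiting low induced width aim to avoid.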
Bayesian Inference in Monte-Carlo Tree Search
Tesauro, Gerald, Rajan, V T, Segal, Richard
Monte-Carlo Tree Search (MCTS) methods are drawing great interest after yielding breakthrough results in computer Go. This paper proposes a Bayesian approach to MCTS that is inspired by distribution-free approaches such as UCT [13], yet significantly differs in important respects. The Bayesian framework allows potentially much more accurate (Bayes-optimal) estimation of node values and node uncertainties from a limited number of simulation trials. We further propose propagating inference in the tree via fast analytic Gaussian approximation methods: this can make the overhead of Bayesian inference manageable in domains such as Go, while preserving high accuracy of expected-value estimates. We find substantial empirical outperformance of UCT in an idealized bandit-tree test environment, where we can obtain valuable insights by comparing with known ground truth. Additionally we rigorously prove on-policy and off-policy convergence of the proposed methods.
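As a point of comparison, the distribution-free child selection used by UCT can be sketched with the standard UCB1 rule. This illustrates the baseline only, not the Bayesian method proposed in the paper; the function name, the `(visits, total_reward)` representation of child statistics, and the default exploration constant are assumptions for illustration.

```python
import math


def ucb1_select(children, c=math.sqrt(2)):
    """Return the index of the child to explore next.

    children -- list of (visits, total_reward) pairs, one per child node
    c        -- exploration constant; sqrt(2) gives the classic UCB1 bonus
    """
    total_visits = sum(visits for visits, _ in children)
    best_index, best_score = None, -math.inf
    for i, (visits, total_reward) in enumerate(children):
        if visits == 0:
            return i  # always try an unvisited child first
        mean = total_reward / visits
        bonus = c * math.sqrt(math.log(total_visits) / visits)
        if mean + bonus > best_score:
            best_index, best_score = i, mean + bonus
    return best_index


# A well-sampled child with mean 0.6 versus a barely sampled one with mean 0.45.
stats = [(100, 60.0), (2, 0.9)]
print(ucb1_select(stats))         # 1: the exploration bonus favours the rarely visited child
print(ucb1_select(stats, c=0.0))  # 0: pure exploitation picks the higher mean
```

The bonus depends only on visit counts, not on the observed spread of rewards, which is the sense in which the rule is distribution-free.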
SAS+ Planning as Satisfiability
Huang, R., Chen, Y., Zhang, W.
Planning as satisfiability is a principal approach to planning with many eminent advantages. The existing planning as satisfiability techniques usually use encodings compiled from STRIPS. We introduce a novel SAT encoding scheme (SASE) based on the SAS+ formalism. The new scheme exploits the structural information in SAS+, resulting in an encoding that is both more compact and more efficient for planning. We prove the correctness of the new encoding by establishing an isomorphism between the solution plans of SASE and those of STRIPS-based encodings. We further analyze the transition variables newly introduced in SASE to explain why it accommodates modern SAT solving algorithms and improves performance. We give empirical statistical results to support our analysis. We also develop a number of techniques to further reduce the encoding size of SASE, and conduct experimental studies to show the strength of each individual technique. Finally, we report extensive experimental results to demonstrate significant improvements of SASE over the state-of-the-art STRIPS-based encoding schemes in terms of both time and memory efficiency.
Composing Traveling Paths from Location-Based Services
Hsieh, Hsun-Ping (Graduate Institute of Networking and Multimedia, National Taiwan University, Taipei, Taiwan) | Li, Cheng-Te (Graduate Institute of Networking and Multimedia, National Taiwan University, Taipei, Taiwan)
With the emergence of location-based services, such as Foursquare and Gowalla, users can easily perform check-in actions anywhere and anytime. Location-based check-ins not only record personal geospatial journeys but also serve as a fine-grained source for trip planning. In this work, we aim to collectively compose traveling paths by leveraging check-in data and mining the movement behaviors of users. A novel system, TP-Comp, is developed. To compose travel paths, TP-Comp not only allows users to specify start, end, and/or must-go locations, but also provides flexibility in the time constraint (i.e., the expected duration of the trip). By treating a sequence of check-in points as a traveling path, we mine frequent sequences and apply a ranking mechanism to achieve this goal. TP-Comp targets travelers who are unfamiliar with the target area or city and have limited time for the trip.
Reasoning about RoboCup Soccer Narratives
Hajishirzi, Hannaneh, Hockenmaier, Julia, Mueller, Erik T., Amir, Eyal
This paper presents an approach for learning to translate simple narratives, i.e., texts (sequences of sentences) describing dynamic systems, into coherent sequences of events without the need for labeled training data. Our approach incorporates domain knowledge in the form of preconditions and effects of events, and we show that it outperforms state-of-the-art supervised learning systems on the task of reconstructing RoboCup soccer games from their commentaries.
Learning is planning: near Bayes-optimal reinforcement learning via Monte-Carlo tree search
Asmuth, John, Littman, Michael L.
Bayes-optimal behavior, while well-defined, is often difficult to achieve. Recent advances in the use of Monte-Carlo tree search (MCTS) have shown that it is possible to act near-optimally in Markov Decision Processes (MDPs) with very large or infinite state spaces. Bayes-optimal behavior in an unknown MDP is equivalent to optimal behavior in the known belief-space MDP, although the size of this belief-space MDP grows exponentially with the amount of history retained, and is potentially infinite. We show how an agent can use one particular MCTS algorithm, Forward Search Sparse Sampling (FSSS), in an efficient way to act nearly Bayes-optimally for all but a polynomial number of steps, assuming that FSSS can be used to act efficiently in any possible underlying MDP.
Robust Local Search for Solving RCPSP/max with Durational Uncertainty
Fu, N., Lau, H.C., Varakantham, P., Xiao, F.
Scheduling problems in manufacturing, logistics and project management have frequently been modeled using the framework of Resource Constrained Project Scheduling Problems with minimum and maximum time lags (RCPSP/max). Due to the importance of these problems, providing scalable solution schedules for RCPSP/max problems is a topic of extensive research. However, all existing methods for solving RCPSP/max assume that durations of activities are known with certainty, an assumption that does not hold in real world scheduling problems where unexpected external events such as manpower availability, weather changes, etc. lead to delays or advances in completion of activities. Thus, in this paper, our focus is on providing a scalable method for solving RCPSP/max problems with durational uncertainty. To that end, we introduce the robust local search method consisting of three key ideas: (a) Introducing and studying the properties of two decision rule approximations used to compute start times of activities with respect to dynamic realizations of the durational uncertainty; (b) Deriving the expression for robust makespan of an execution strategy based on decision rule approximations; and (c) A robust local search mechanism to efficiently compute activity execution strategies that are robust against durational uncertainty. Furthermore, we also provide enhancements to local search that exploit temporal dependencies between activities. Our experimental results illustrate that robust local search is able to provide robust execution strategies efficiently.