Goto

Collaborating Authors

 Planning & Scheduling


Generating Plans that Predict Themselves

arXiv.org Artificial Intelligence

Collaboration requires coordination, and we coordinate by anticipating our teammates' future actions and adapting to their plan. In some cases, our teammates' actions early on can give us a clear idea of what the remainder of their plan is, i.e. what action sequence we should expect. In others, they might leave us less confident, or even lead us to the wrong conclusion. Our goal is for robot actions to fall in the first category: we want to enable robots to select their actions in such a way that human collaborators can easily use them to correctly anticipate what will follow. While previous work has focused on finding initial plans that convey a set goal, here we focus on finding two portions of a plan such that the initial portion conveys the final one. We introduce $t$-\ACty{}: a measure that quantifies the accuracy and confidence with which human observers can predict the remaining robot plan from the overall task goal and the observed initial $t$ actions in the plan. We contribute a method for generating $t$-predictable plans: we search for a full plan that accomplishes the task, but in which the first $t$ actions make it as easy as possible to infer the remaining ones. The result is often different from the most efficient plan, in which the initial actions might leave a lot of ambiguity as to how the task will be completed. Through an online experiment and an in-person user study with physical robots, we find that our approach outperforms a traditional efficiency-based planner in objective and subjective collaboration metrics.


Learning to Search with MCTSnets

arXiv.org Machine Learning

Planning problems are among the most important and well-studied problems in artificial intelligence. They are most typically solved by tree search algorithms that simulate ahead into the future, evaluate future states, and back-up those evaluations to the root of a search tree. Among these algorithms, Monte-Carlo tree search (MCTS) is one of the most general, powerful and widely used. A typical implementation of MCTS uses cleverly designed rules, optimized to the particular characteristics of the domain. These rules control where the simulation traverses, what to evaluate in the states that are reached, and how to back-up those evaluations. In this paper we instead learn where, what and how to search. Our architecture, which we call an MCTSnet, incorporates simulation-based search inside a neural network, by expanding, evaluating and backing-up a vector embedding. The parameters of the network are trained end-to-end using gradient-based optimisation. When applied to small searches in the well known planning problem Sokoban, the learned search algorithm significantly outperformed MCTS baselines.


An Introduction to Monte Carlo Tree Search

@machinelearnbot

We recently witnessed one of the biggest game AI events in history – Alpha Go became the first computer program to beat the world champion in a game of Go. The publication can be found here. Different techniques from machine learning and tree search have been combined by developers from DeepMind to ...


Guaranteed Plans for Multi-Robot Systems via Optimization Modulo Theories

AAAI Conferences

Industries are on the brink of widely accepting a new paradigm for organizing production by having autonomous robots manage in-factory processes. This transition from static process chains towards more automation and autonomy poses new challenges in terms of, e.g., efficiency of production processes. The RoboCup Logistics League (RCLL) has been proposed as a realistic testbed to study the above mentioned problem at a manageable scale. In RCLL, teams of robots manage and optimize the material flow according to dynamic orders in a simplified factory environment. In particular, robots have to transport workpieces among several machines scattered around the factory shop floor. Each machine performs a specific processing step, orders that denote the products which must be assembled with these operations are posted at run-time and require quick planning and scheduling. Orders also come with a delivery time window, therefore introducing a temporal component into the problem. Though there exist successful heuristic approaches to solve the underlying planning and scheduling problems, a disadvantage of these methods is that they provide no guarantees about the quality of the solution. A promising solution to this problem is offered by the recently emerging field of Optimization Modulo Theories (OMT), where Satisfiability Modulo Theories (SMT) solving is extended with optimization functionalities. In this paper, we present an approach that combines bounded model checking and optimization to generate optimal controllers for multi-robot systems. In particular, using the RoboCup Logistics League as a testbed, we build formal models for robot motions, production processes, and for order schedules, deadlines and rewards. We then encode the synthesis problem as a linear mixed-integer problem and employ Optimization Modulo Theories to synthesize controllers with optimality guarantees.


Optimal Approximation of Random Variables for Estimating the Probability of Meeting a Plan Deadline

AAAI Conferences

In planning algorithms and in other domains, there is often a need to run long computations that involve summations, maximizations and other operations on random variables, and to store intermediate results. In this paper, as a main motivating example, we elaborate on the case of estimating probabilities of meeting deadlines in hierarchical plans. A source of computational complexity, often neglected in the analysis of such algorithms, is that the support of the variables needed as intermediate results may grow exponentially along the computation. Therefore, to avoid exponential memory and time complexities, we need to trim these variables. This is similar, in a sense, to rounding intermediate results in numerical computations. Of course, to maintain the quality of algorithms, the trimming procedure should be efficient and it must maintain accuracy as much as possible. In this paper, we propose an optimal trimming algorithm with polynomial time and memory complexities for the purpose of estimating probabilities of deadlines in plans. More specifically, we show that our algorithm, given the needed size of the representation of the variable, provides the best possible approximation, where approximation accuracy is considered with a measure that fits the goal of estimating deadline meeting probabilities.


Meta-Search Through the Space of Representations and Heuristics on a Problem by Problem Basis

AAAI Conferences

Two key aspects of problem solving are representation and search heuristics. Both theoretical and experimental studies have shown that there is no one best problem representation nor one best search heuristic. Therefore, some recent methods, e.g., portfolios, learn a good combination of problem solvers to be used in a given domain or set of domains. There are even dynamic portfolios that select a particular combination of problem solvers specific to a problem. These approaches: (1) need to perform a learning step; (2) do not usually focus on changing the representation of the input domain/problem; and (3) frequently do not adapt the portfolio to the specific problem. This paper describes a meta-reasoning system that searches through the space of combinations of representations and heuristics to find one suitable for optimally solving the specific problem. We show that this approach can be better than selecting a combination to use for all problems within a domain and is competitive with state of the art optimal planners.


Goal Recognition in Incomplete Domain Models

AAAI Conferences

Recent approaches to goal recognition have progressively relaxed the assumptions about the amount and correctness of domain knowledge and available observations, yielding accurate and efficient algorithms. These approaches, however, assume completeness and correctness of the domain theory against which their algorithms match observations: this is too strong for most real-world domains. In this work, we develop a goal recognition technique capable of recognizing goals using incomplete (and possibly incorrect) domain theories.


Plan-Based Intention Revision

AAAI Conferences

Plan-based story generation has operationalized concepts from the Belief-Desire-Intention (BDI) theory of mind to create goal-driven character agents with explainable behavior. However, these character agents are limited in that they do not capture the dynamic nature of intentions. To address this limitation, we define a plan-based intention revision model and propose an evaluation using the QUEST cognitive model to assess the explainability of an intention revision.


An Automated Employee Timetabling System for Small Businesses

AAAI Conferences

Employee scheduling is one of the most difficult challenges facing any small business owner. The problem becomes more complex when employees with different levels of seniority indicate preferences for specific roles in certain shifts and request flexible work hours outside of the standard eight-hour block. Many business owners and managers, who cannot afford (or choose not to use) commercially-available timetabling apps, spend numerous hours creating sub-optimal schedules by hand, leading to low staff morale. In this paper, we explain how two undergraduate students generalized the Nurse Scheduling Problem to take into account multiple roles and flexible work hours, and implemented a user-friendly automated timetabler based on a four-dimensional integer linear program. This system has been successfully deployed at two businesses in our community, each with 20+ employees: a coffee shop and a health clinic.


Guiding Search in Continuous State-Action Spaces by Learning an Action Sampler From Off-Target Search Experience

AAAI Conferences

In robotics, it is essential to be able to plan efficiently in high-dimensional continuous state-action spaces for long horizons. For such complex planning problems, unguided uniform sampling of actions until a path to a goal is found is hopelessly inefficient, and gradient-based approaches often fall short when the optimization manifold of a given problem is not smooth. In this paper, we present an approach that guides search in continuous spaces for generic planners by learning an action sampler from past search experience. We use a Generative Adversarial Network (GAN) to represent an action sampler, and address an important issue: search experience consists of a relatively large number of actions that are not on a solution path and a relatively small number of actions that actually are on a solution path. We introduce a new technique, based on an importance-ratio estimation method, for using samples from a non-target distribution to make GAN learning more data-efficient. We provide theoretical guarantees and empirical evaluation in three challenging continuous robot planning problems to illustrate the effectiveness of our algorithm.