Planning & Scheduling: Overviews

Planning for Goal-Oriented Dialogue Systems Artificial Intelligence

Generating complex multi-turn goal-oriented dialogue agents is a difficult problem that has received considerable attention from many leaders in the tech industry, including IBM, Google, Amazon, and Microsoft. This is in large part due to the rapidly growing market demand for dialogue agents capable of goal-oriented behaviour. Due to the business process nature of these conversations, end-to-end machine learning systems are generally not a viable option, as the generated dialogue agents must be deployable and verifiable on behalf of the businesses authoring them. In this work, we propose a paradigm shift in the creation of goal-oriented complex dialogue systems that eliminates the need for a designer to manually specify a dialogue tree, which nearly all current systems have to resort to when the interaction pattern falls outside standard patterns such as slot filling. We propose a declarative representation of the dialogue agent to be processed by state-of-the-art planning technology. Our proposed approach covers all aspects of the process, from model solicitation to the execution of the generated plans/dialogue agents. Along the way, we introduce novel planning encodings for declarative dialogue synthesis, a variety of interfaces for working with the specification as a dialogue architect, and a robust executor for generalized contingent plans. We have created prototype implementations of all components, and in this paper, we further demonstrate the resulting system empirically.

Active Goal Recognition Artificial Intelligence

To coordinate with other systems, agents must be able to determine what the systems are currently doing and predict what they will be doing in the future, that is, to perform plan and goal recognition. There are many methods for plan and goal recognition, but they assume a passive observer that continually monitors the target system. Real-world domains, where information gathering has a cost (e.g., moving a camera or a robot, or time taken away from another task), will often require a more active observer. We propose to combine goal recognition with other observer tasks in order to obtain active goal recognition (AGR). We discuss this problem and provide a model and preliminary experimental results for one form of this composite problem. As expected, the results show that optimal behavior in AGR problems balances information gathering with other actions (e.g., task completion) so as to achieve all tasks jointly and efficiently. We hope that our formulation opens the door for extensive further research on this interesting and realistic problem.

A Human-Centered Data-Driven Planner-Actor-Critic Architecture via Logic Programming Artificial Intelligence

Recent successes of Reinforcement Learning (RL) allow an agent to learn policies that surpass human experts, but RL suffers from being time-hungry and data-hungry. By contrast, human learning is significantly faster because prior and general knowledge and multiple information resources are utilized. In this paper, we propose a Planner-Actor-Critic architecture for huMAN-centered planning and learning (PACMAN), where an agent uses its prior, high-level, deterministic symbolic knowledge to plan for goal-directed actions, and also integrates the Actor-Critic algorithm of RL to fine-tune its behavior towards both environmental rewards and human feedback. This work is the first unified framework where knowledge-based planning, RL, and human teaching jointly contribute to the policy learning of an agent. Our experiments demonstrate that PACMAN leads to a significant jump-start at the early stage of learning, converges rapidly and with small variance, and is robust to inconsistent, infrequent, and misleading feedback.

Allen's Interval Algebra Makes the Difference Artificial Intelligence

Allen's Interval Algebra constitutes a framework for reasoning about temporal information in a qualitative manner. In particular, it uses intervals, i.e., pairs of endpoints, on the timeline to represent entities corresponding to actions, events, or tasks, and binary relations such as precedes and overlaps to encode the possible configurations between those entities. Allen's calculus has found its way into many academic and industrial applications that involve, most commonly, planning and scheduling, temporal databases, and healthcare. In this paper, we present a novel encoding of Interval Algebra using answer-set programming (ASP) extended by difference constraints, i.e., the fragment abbreviated as ASP(DL), and demonstrate its performance via a preliminary experimental evaluation. Although our ASP encoding is presented in the case of Allen's calculus for the sake of clarity, we suggest that analogous encodings can be devised for other point-based calculi, too.
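
To make the base relations concrete, the following minimal Python sketch classifies a pair of intervals with known endpoints into one of the thirteen Allen relations. It is an illustration only and not the ASP(DL) encoding the abstract describes; the function name `allen_relation` and the relation labels are assumptions chosen here. Each branch is a comparison between endpoints, which hints at why constraints over endpoint differences are a natural fit for the calculus.

```python
def allen_relation(a, b):
    """Return the Allen base relation holding between intervals a and b.

    Each interval is a (start, end) pair with start < end.
    Hypothetical helper for illustration, not taken from the paper.
    """
    (s1, e1), (s2, e2) = a, b
    if s1 >= e1 or s2 >= e2:
        raise ValueError("intervals must satisfy start < end")
    # Disjoint or touching intervals.
    if e1 < s2:
        return "precedes"
    if e1 == s2:
        return "meets"
    if e2 < s1:
        return "preceded-by"
    if e2 == s1:
        return "met-by"
    # From here on the two intervals share interior points.
    if s1 == s2 and e1 == e2:
        return "equals"
    if s1 == s2:
        return "starts" if e1 < e2 else "started-by"
    if e1 == e2:
        return "finishes" if s1 > s2 else "finished-by"
    if s2 < s1 and e1 < e2:
        return "during"
    if s1 < s2 and e2 < e1:
        return "contains"
    return "overlaps" if s1 < s2 else "overlapped-by"


print(allen_relation((0, 2), (3, 5)))  # precedes
print(allen_relation((0, 3), (2, 5)))  # overlaps
```

A reasoner for the algebra works with unknown endpoints and disjunctions of these relations; the sketch covers only the fully instantiated case.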

Task-assisted Motion Planning in Partially Observable Domains Artificial Intelligence

Antony Thomas, Sunny Amatya, Fulvio Mastrogiovanni, and Marco Baglietto

Abstract: We present an integrated Task-Motion Planning framework for robot navigation in belief space. Autonomous robots operating in real-world complex scenarios require planning in the discrete (task) space and the continuous (motion) space. To this end, we propose a framework for integrating belief space reasoning within a hybrid task planner. The expressive power of PDDL combined with heuristic-driven semantic attachments performs the propagated and posterior belief estimates while planning. The underlying methodology for the development of the combined hybrid planner is discussed, providing suggestions for improvements and future work.

Introduction: Autonomous robots operating in complex real-world scenarios require different levels of planning to execute their tasks. High-level (task) planning helps break down a given set of tasks into a sequence of sub-tasks, while the actual execution of each of these sub-tasks requires low-level control actions to generate appropriate robot motions. In fact, the dependency between logical and geometrical aspects is pervasive in both task planning and execution. Hence, planning should be performed in the task-motion or the discrete-continuous space. In recent years, combining high-level task planning with low-level motion planning has been a subject of great interest among the Robotics and Artificial Intelligence (AI) community.

Monte-Carlo Tree Search for Simulation-based Strategy Analysis Artificial Intelligence

Games are often designed to shape player behavior in a desired way; however, it can be unclear how design decisions affect the space of behaviors in a game. Designers usually explore this space through human playtesting, which can be time-consuming and of limited effectiveness in exhausting the space of possible behaviors. In this paper, we propose the use of automated planning agents to simulate humans of varying skill levels to generate game playthroughs. Metrics can then be gathered from these playthroughs to evaluate the current game design and identify its potential flaws. We demonstrate this technique in two games: the popular word game Scrabble and a collectible card game of our own design named Cardonomicon. Using these case studies, we show how using simulated agents to model humans of varying skill levels allows us to extract metrics to describe game balance (in the case of Scrabble) and highlight potential design flaws (in the case of Cardonomicon).
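
The title names Monte-Carlo Tree Search as the planning technique. As a reminder of how such an agent decides which move to explore next, here is a minimal Python sketch of UCB1 child selection, the rule commonly used inside MCTS (often called UCT). The abstract does not say which selection rule, exploration constant, or skill-level mechanism the authors use, so the function name `ucb1_select`, the tuple layout, and the default constant are all assumptions made for illustration.

```python
import math


def ucb1_select(children, exploration=math.sqrt(2)):
    """Pick the child move to explore next using the UCB1 rule.

    children: list of (name, total_reward, visits) tuples.
    Hypothetical data layout, not taken from the paper.
    """
    parent_visits = sum(visits for _, _, visits in children)
    best_name, best_score = None, -math.inf
    for name, total_reward, visits in children:
        if visits == 0:
            # Unvisited moves are tried before any scoring takes place.
            return name
        mean_reward = total_reward / visits
        bonus = exploration * math.sqrt(math.log(parent_visits) / visits)
        score = mean_reward + bonus
        if score > best_score:
            best_name, best_score = name, score
    return best_name


# The rarely visited move "b" wins on its exploration bonus.
print(ucb1_select([("a", 5.0, 10), ("b", 0.4, 1)]))
```

The exploration term shrinks as a move accumulates visits, so search effort gradually concentrates on moves with high average reward.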

Representation Learning for Classical Planning from Partially Observed Traces Artificial Intelligence

Specifying a complete domain model is time-consuming, which has been a bottleneck for applying AI planning techniques in many real-world scenarios. Most classical domain-model learning approaches output a domain model in the form of a declarative planning language, such as STRIPS or PDDL, and solve new planning instances by invoking an existing planner. However, planning in such a representation is sensitive to the accuracy of the learned domain model, which probably cannot be used to solve real planning problems. In this paper, to represent domain models in a vectorized form, we propose a novel framework based on graph neural networks (GNNs) that integrates model-free learning and model-based planning, called LP-GNN. By embedding propositions and actions in a graph, the latent relationship between them is explored to form domain-specific heuristics. We evaluate our approach on five classical planning domains, comparing it with the classical domain-model learner ARMS. The experimental results show that the domain models learned by our approach are much more effective at solving real planning problems.

Goal Recognition Design in Deterministic Environments

Journal of Artificial Intelligence Research

Goal recognition design (GRD) facilitates understanding the goals of acting agents through the analysis and redesign of goal recognition models, thus offering a solution for assessing and minimizing the maximal progress of any agent in the model before goal recognition is guaranteed. In a nutshell, given a model of a domain and a set of possible goals, a solution to a GRD problem determines (1) the extent to which actions performed by an agent within the model reveal the agent’s objective; and (2) how best to modify the model so that the objective of an agent can be detected as early as possible. This approach is relevant to any domain in which rapid goal recognition is essential and the model design can be controlled. Applications include intrusion detection, assisted cognition, computer games, and human-robot collaboration. A GRD problem has two components: the analyzed goal recognition setting, and a design model specifying the possible ways the environment in which agents act can be modified so as to facilitate recognition. This work formulates a general framework for GRD in deterministic and partially observable environments, and offers a toolbox of solutions for evaluating and optimizing model quality for various settings. For the purpose of evaluation we suggest the worst case distinctiveness (WCD) measure, which represents the maximal cost of a path an agent may follow before its goal can be inferred by a goal recognition system. We offer novel compilations to classical planning for calculating WCD in settings where agents are bounded-suboptimal. We then suggest methods for minimizing WCD by searching for an optimal redesign strategy within the space of possible modifications, and using pruning to increase efficiency. We support our approach with an empirical evaluation that measures WCD in a variety of GRD settings and tests the efficiency of our compilation-based methods for computing it. 
We also examine the effectiveness of reducing WCD via redesign and the performance gain brought about by our proposed pruning strategy.
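
As a rough illustration of the WCD measure, the following Python sketch computes it for a small graph under strong simplifying assumptions that are not the paper's general setting: agents are optimal, every move costs 1, edges are undirected, and the observer sees every step. Under those assumptions a node lies on an optimal path to a goal exactly when the distance from the start to the node plus the distance from the node to the goal equals the optimal distance to that goal, so WCD is the largest start distance among nodes that are compatible with at least two goals. The function names and the example graph are hypothetical, and the paper computes WCD through compilations to classical planning rather than this direct search.

```python
from collections import deque


def bfs_distances(graph, source):
    """Unit-cost shortest distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def worst_case_distinctiveness(graph, start, goals):
    """Longest optimal-path prefix that is still consistent with two goals."""
    from_start = bfs_distances(graph, start)
    from_goal = {g: bfs_distances(graph, g) for g in goals}
    wcd = 0
    for node, d in from_start.items():
        compatible = [
            g for g in goals
            if g in from_start
            and node in from_goal[g]
            and d + from_goal[g][node] == from_start[g]
        ]
        if len(compatible) >= 2:
            wcd = max(wcd, d)
    return wcd


def undirected(edges):
    """Build an adjacency map from a list of undirected edges."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


original = undirected([("S", "A"), ("A", "G1"), ("A", "G2"), ("S", "B"), ("B", "G2")])
# Redesign: remove the edge A-G2 so that reaching A already reveals goal G1.
redesigned = undirected([("S", "A"), ("A", "G1"), ("S", "B"), ("B", "G2")])

print(worst_case_distinctiveness(original, "S", ["G1", "G2"]))    # 1
print(worst_case_distinctiveness(redesigned, "S", ["G1", "G2"]))  # 0
```

In the original graph an agent can take one step (to A) without revealing its goal; removing a single edge brings that down to zero, which is the kind of redesign the abstract describes.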

Probabilistic Planning with Reduced Models

Journal of Artificial Intelligence Research

Reduced models are simplified versions of a given domain, designed to accelerate the planning process. Interest in reduced models has grown since the surprising success of determinization in the first international probabilistic planning competition, leading to the development of several enhanced determinization techniques. To address the drawbacks of previous determinization methods, we introduce a family of reduced models in which probabilistic outcomes are classified as one of two types: primary and exceptional. In each model that belongs to this family of reductions, primary outcomes can occur an unbounded number of times per trajectory, while exceptions can occur at most a finite number of times, specified by a parameter. Distinct reduced models are characterized by two parameters: the maximum number of primary outcomes per action, and the maximum number of occurrences of exceptions per trajectory. This family of reductions generalizes the well-known most-likely-outcome determinization approach, which includes one primary outcome per action and zero exceptional outcomes per plan. We present a framework to determine the benefits of planning with reduced models, and develop a continual planning approach that handles situations where the number of exceptions exceeds the specified bound during plan execution. Using this framework, we compare the performance of various reduced models and consider the challenge of generating good ones automatically. We show that each one of the dimensions (allowing more than one primary outcome or planning for some limited number of exceptions) could improve performance relative to standard determinization. The results place previous work on determinization in a broader context and lay the foundation for a systematic exploration of the space of model reductions.
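
The two parameters can be illustrated with a small Python sketch for a single action. Everything here is an assumption made for illustration: the function names, choosing primary outcomes by highest probability (the abstract leaves open how good reductions are chosen), and renormalizing the probabilities of the outcomes that remain. With one primary outcome and no exceptions allowed, the sketch reduces to most-likely-outcome determinization.

```python
def split_outcomes(outcomes, max_primary):
    """Split {effect: probability} into primary and exceptional outcomes.

    Assumption: the most probable outcomes are treated as primary.
    """
    ranked = sorted(outcomes.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:max_primary]), dict(ranked[max_primary:])


def reduced_step(outcomes, max_primary, exceptions_left):
    """Outcomes the reduced model plans for in the current state.

    Returns {effect: (probability, exceptions_left_afterwards)}.
    Assumption: probabilities are renormalized over the allowed outcomes.
    """
    primary, exceptional = split_outcomes(outcomes, max_primary)
    allowed = dict(primary)
    if exceptions_left > 0:
        # Exceptions stay available only while the per-trajectory budget lasts.
        allowed.update(exceptional)
    total = sum(allowed.values())
    step = {}
    for effect, prob in allowed.items():
        used = 1 if effect in exceptional else 0
        step[effect] = (prob / total, exceptions_left - used)
    return step


move = {"moved": 0.75, "slipped": 0.125, "broke": 0.125}

# One primary outcome, no exceptions: most-likely-outcome determinization.
print(reduced_step(move, max_primary=1, exceptions_left=0))
# One primary outcome, one exception still allowed on this trajectory.
print(reduced_step(move, max_primary=1, exceptions_left=1))
```

Tracking the remaining exception budget alongside each outcome mirrors the idea that the reduced model's state must remember how many exceptions have already occurred.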

Incremental Learning of Discrete Planning Domains from Continuous Perceptions Artificial Intelligence

We propose a framework for learning discrete deterministic planning domains. In this framework, an agent learns the domain by observing the action effects through continuous features that describe the state of the environment after the execution of each action. In addition, the agent learns its perception function, i.e., a probabilistic mapping between state variables and sensor data represented as a vector of continuous random variables called perception variables. We define an algorithm that updates the planning domain and the perception function by (i) introducing new states, either by extending the possible values of state variables or by weakening their constraints; (ii) adapting the perception function to fit the observed data; and (iii) adapting the transition function on the basis of the executed actions and the effects observed via the perception function. The framework is able to deal with exogenous events that happen in the environment.