control knowledge
The notion of role in conceptual modelling
Reynaud, Chantal, Aussenac-Gilles, Nathalie, Tchounikine, Pierre, Trichet, Franckie
First of all, we present how the relationship between problem solving methods and domain models is tackled in different approaches. We concentrate on how they cope with this issue in the knowledge engineering process. Secondly, we introduce several properties which can be used to analyse, characterise and define the notion of role. We evaluate and compare the works exposed previously following these dimensions. This analysis suggests some developments to better exploit the relationship between reasoning and domain knowledge.
Optimizing Discrete Spaces via Expensive Evaluations: A Learning to Search Framework
Deshwal, Aryan, Belakaria, Syrine, Doppa, Janardhan Rao, Fern, Alan
We consider the problem of optimizing expensive black-box functions over discrete spaces (e.g., sets, sequences, graphs). The key challenge is to select a sequence of combinatorial structures to evaluate, in order to identify high-performing structures as quickly as possible. Our main contribution is to introduce and evaluate a new learning-to-search framework for this problem called L2S-DISCO. The key insight is to employ search procedures guided by control knowledge at each step to select the next structure and to improve the control knowledge as new function evaluations are observed. We provide a concrete instantiation of L2S-DISCO for local search procedure and empirically evaluate it on diverse real-world benchmarks. Results show the efficacy of L2S-DISCO over state-of-the-art algorithms in solving complex optimization problems.
There Is No Agency Without Attention
Bello, Paul (Navy Center for Applied Research in Artificial Intelligence) | Bridewell, Will (Navy Center for Applied Research in Artificial Intelligence)
For decades AI researchers have built agents that are capable of carrying out tasks that require human-level or human-like intelligence. During this time, questions of how these programs compared in kind to humans have surfaced and led to beneficial interdisciplinary discussions, but conceptual progress has been slower than technological progress. Within the past decade, the term agency has taken on new import as intelligent agents have become a noticeable part of our everyday lives. Research on autonomous vehicles and personal assistants has expanded into private industry with new and increasingly capable products surfacing as a matter of routine. This wider use of AI technologies has raised questions about legal and moral agency at the highest levels of government (National Science and Technology Council 2016) and drawn the interest of other academic disciplines and the general public. Within this context, the notion of an intelligent agent in AI is too coarse and in need of refinement. We suggest that the space of AI agents can be subdivided into classes, where each class is defined by an associated degree of control.
Decision-Theoretic Planning with non-Markovian Rewards
Gretton, C., Kabanza, F., Price, D., Slaney, J., Thiebaux, S.
A decision process in which rewards depend on history rather than merely on the current state is called a decision process with non-Markovian rewards (NMRDP). In decision-theoretic planning, where many desirable behaviours are more naturally expressed as properties of execution sequences rather than as properties of states, NMRDPs form a more natural model than the commonly adopted fully Markovian decision process (MDP) model. While the more tractable solution methods developed for MDPs do not directly apply in the presence of non-Markovian rewards, a number of solution methods for NMRDPs have been proposed in the literature. These all exploit a compact specification of the non-Markovian reward function in temporal logic, to automatically translate the NMRDP into an equivalent MDP which is solved using efficient MDP solution methods. This paper presents NMRDPP (Non-Markovian Reward Decision Process Planner), a software platform for the development and experimentation of methods for decision-theoretic planning with non-Markovian rewards. The current version of NMRDPP implements, under a single interface, a family of methods based on existing as well as new approaches which we describe in detail. These include dynamic programming, heuristic search, and structured methods. Using NMRDPP, we compare the methods and identify certain problem features that affect their performance. NMRDPPs treatment of non-Markovian rewards is inspired by the treatment of domain-specific search control knowledge in the TLPlan planner, which it incorporates as a special case. In the First International Probabilistic Planning Competition, NMRDPP was able to compete and perform well in both the domain-independent and hand-coded tracks, using search control knowledge in the latter.
Knowledge Transfer between Automated Planners
Fernandez, Susana (Universidad Carlos III de Madrid) | Aler, Ricardo (Universidad Carlos III de Madrid) | Borrajo, Daniel (Universidad Carlos III de Madrid)
In this article, we discuss the problem of transferring search heuristics from one planner to another. More specifically, we demonstrate how to transfer the domain-dependent heuristics acquired by one planner into a second planner. Our motivation is to improve the efficiency and the efficacy of the second planner by allowing it to use the transferred heuristics to capture domain regularities that it would not otherwise recognize. Our experimental results show that the transferred knowledge does improve the second planner's performance on novel tasks over a set of seven benchmark planning domains.
The 3rd International Planning Competition: Results and Analysis
This paper reports the outcome of the third in the series of biennial international planning competitions, held in association with the International Conference on AI Planning and Scheduling (AIPS) in 2002. In addition to describing the domains, the planners and the objectives of the competition, the paper includes analysis of the results. The results are analysed from several perspectives, in order to address the questions of comparative performance between planners, comparative difficulty of domains, the degree of agreement between planners about the relative difficulty of individual problem instances and the question of how well planners scale relative to one another over increasingly difficult problems. The paper addresses these questions through statistical analysis of the raw results of the competition, in order to determine which results can be considered to be adequately supported by the data. The paper concludes with a discussion of some challenges for the future of the competition series.
Iterative Learning of Weighted Rule Sets for Greedy Search
Xu, Yuehua (Oregon State University) | Fern, Alan (Oregon State University) | Yoon, Sungwook (Palo Alto Research Center)
Greedy search is commonly used in an attempt to generate solutions quickly at the expense of completeness and optimality. In this work, we consider learning sets of weighted action-selection rules for guiding greedy search with application to automated planning. We make two primary contributions over prior work on learning for greedy search. First, we introduce weighted sets of action-selection rules as a new form of control knowledge for greedy search. Prior work has shown the utility of action-selection rules for greedy search, but has treated the rules as hard constraints, resulting in brittleness. Our weighted rule sets allow multiple rules to vote, helping to improve robustness to noisy rules. Second, we give a new iterative learning algorithm for learning weighted rule sets based on RankBoost, an efficient boosting algorithm for ranking. Each iteration considers the actual performance of the current rule set and directs learning based on the observed search errors. This is in contrast to most prior approaches, which learn control knowledge independently of the search process. Our empirical results have shown significant promise for this approach in a number of domains.
On Planning with Preferences in HTN
Sohrabi, Shirin, McIlraith, Sheila A.
In this paper, we address the problem of generating preferred plans by combining the procedural control knowledge specified by Hierarchical Task Networks (HTNs) with rich qualitative user preferences. The outcome of our work is a language for specifyin user preferences, tailored to HTN planning, together with a provably optimal preference-based planner, HTNPLAN, that is implemented as an extension of SHOP2. To compute preferred plans, we propose an approach based on forward-chaining heuristic search. Our heuristic uses an admissible evaluation function measuring the satisfaction of preferences over partial plans. Our empirical evaluation demonstrates the effectiveness of our HTNPLAN heuristics. We prove our approach sound and optimal with respect to the plans it generates by appealing to a situation calculus semantics of our preference language and of HTN planning. While our implementation builds on SHOP2, the language and techniques proposed here are relevant to a broad range of HTN planners.
Decision-Theoretic Planning with non-Markovian Rewards
Thiebaux, S., Gretton, C., Slaney, J., Price, D., Kabanza, F.
A decision process in which rewards depend on history rather than merely on the current state is called a decision process with non-Markovian rewards (NMRDP). In decision-theoretic planning, where many desirable behaviours are more naturally expressed as properties of execution sequences rather than as properties of states, NMRDPs form a more natural model than the commonly adopted fully Markovian decision process (MDP) model. While the more tractable solution methods developed for MDPs do not directly apply in the presence of non-Markovian rewards, a number of solution methods for NMRDPs have been proposed in the literature. These all exploit a compact specification of the non-Markovian reward function in temporal logic, to automatically translate the NMRDP into an equivalent MDP which is solved using efficient MDP solution methods. This paper presents NMRDPP (Non-Markovian Reward Decision Process Planner), a software platform for the development and experimentation of methods for decision-theoretic planning with non-Markovian rewards. The current version of NMRDPP implements, under a single interface, a family of methods based on existing as well as new approaches which we describe in detail. These include dynamic programming, heuristic search, and structured methods. Using NMRDPP, we compare the methods and identify certain problem features that affect their performance. NMRDPP's treatment of non-Markovian rewards is inspired by the treatment of domain-specific search control knowledge in the TLPlan planner, which it incorporates as a special case. In the First International Probabilistic Planning Competition, NMRDPP was able to compete and perform well in both the domain-independent and hand-coded tracks, using search control knowledge in the latter.
Approximate Policy Iteration with a Policy Language Bias
Fern, Alan, Yoon, Sungwook, Givan, Robert
We explore approximate policy iteration, replacing the usual costfunction learning step with a learning step in policy space. We give policy-language biases that enable solution of very large relational Markov decision processes (MDPs) that no previous technique can solve. In particular, we induce high-quality domain-specific planners for classical planning domains (both deterministic and stochastic variants) by solving such domains as extremely large MDPs.