Goto

Collaborating Authors

 Ståhlberg, Simon


Learning to Ground Existentially Quantified Goals

arXiv.org Artificial Intelligence

Goal instructions for autonomous AI agents cannot assume that objects have unique names. Instead, objects in goals must be referred to by providing suitable descriptions. However, this raises problems in both classical planning and generalized planning. The standard approach to handling existentially quantified goals in classical planning involves compiling them into a DNF formula that encodes all possible variable bindings and adding dummy actions to map each DNF term into the new, dummy goal. This preprocessing is exponential in the number of variables. In generalized planning, the problem is different: even if general policies can deal with any initial situation and goal, executing a general policy requires the goal to be grounded to define a value for the policy features. The problem of grounding goals, namely finding the objects to bind the goal variables, is subtle: it is a generalization of classical planning, which is a special case when there are no goal variables to bind, and constraint reasoning, which is a special case when there are no actions. In this work, we address the goal grounding problem with a novel supervised learning approach. A GNN architecture, trained to predict the cost of partially quantified goals over small domain instances is tested on larger instances involving more objects and different quantified goals. The proposed architecture is evaluated experimentally over several planning domains where generalization is tested along several dimensions including the number of goal variables and objects that can bind such variables. The scope of the approach is also discussed in light of the known relationship between GNNs and C2 logics.


Symmetries and Expressive Requirements for Learning General Policies

arXiv.org Artificial Intelligence

State symmetries play an important role in planning and generalized planning. In the first case, state symmetries can be used to reduce the size of the search; in the second, to reduce the size of the training set. In the case of general planning, however, it is also critical to distinguish non-symmetric states, i.e., states that represent non-isomorphic relational structures. However, while the language of first-order logic distinguishes non-symmetric states, the languages and architectures used to represent and learn general policies do not. In particular, recent approaches for learning general policies use state features derived from description logics or learned via graph neural networks (GNNs) that are known to be limited by the expressive power of C_2, first-order logic with two variables and counting. In this work, we address the problem of detecting symmetries in planning and generalized planning and use the results to assess the expressive requirements for learning general policies over various planning domains. For this, we map planning states to plain graphs, run off-the-shelf algorithms to determine whether two states are isomorphic with respect to the goal, and run coloring algorithms to determine if C_2 features computed logically or via GNNs distinguish non-isomorphic states. Symmetry detection results in more effective learning, while the failure to detect non-symmetries prevents general policies from being learned at all in certain domains.


Learning General Policies for Classical Planning Domains: Getting Beyond C$_2$

arXiv.org Artificial Intelligence

GNN-based approaches for learning general policies across planning domains are limited by the expressive power of $C_2$, namely; first-order logic with two variables and counting. This limitation can be overcomed by transitioning to $k$-GNNs, for $k=3$, wherein object embeddings are substituted with triplet embeddings. Yet, while $3$-GNNs have the expressive power of $C_3$, unlike $1$- and $2$-GNNs that are confined to $C_2$, they require quartic time for message exchange and cubic space for embeddings, rendering them impractical. In this work, we introduce a parameterized version of relational GNNs. When $t$ is infinity, R-GNN[$t$] approximates $3$-GNNs using only quadratic space for embeddings. For lower values of $t$, such as $t=1$ and $t=2$, R-GNN[$t$] achieves a weaker approximation by exchanging fewer messages, yet interestingly, often yield the $C_3$ features required in several planning domains. Furthermore, the new R-GNN[$t$] architecture is the original R-GNN architecture with a suitable transformation applied to the input states only. Experimental results illustrate the clear performance gains of R-GNN[$1$] and R-GNN[$2$] over plain R-GNNs, and also over edge transformers that also approximate $3$-GNNs.


Learning General Optimal Policies with Graph Neural Networks: Expressive Power, Transparency, and Limits

arXiv.org Artificial Intelligence

It has been recently shown that general policies for many classical planning domains can be expressed and learned in terms of a pool of features defined from the domain predicates using a description logic grammar. At the same time, most description logics correspond to a fragment of $k$-variable counting logic ($C_k$) for $k=2$, that has been shown to provide a tight characterization of the expressive power of graph neural networks. In this work, we make use of these results to understand the power and limits of using graph neural networks (GNNs) for learning optimal general policies over a number of tractable planning domains where such policies are known to exist. For this, we train a simple GNN in a supervised manner to approximate the optimal value function $V^{*}(s)$ of a number of sample states $s$. As predicted by the theory, it is observed that general optimal policies are obtained in domains where general optimal value functions can be defined with $C_2$ features but not in those requiring more expressive $C_3$ features. In addition, it is observed that the features learned are in close correspondence with the features needed to express $V^{*}$ in closed form. The theory and the analysis of the domains let us understand the features that are actually learned as well as those that cannot be learned in this way, and let us move in a principled manner from a combinatorial optimization approach to learning general policies to a potentially, more robust and scalable approach based on deep learning.


Tailoring Pattern Databases for Unsolvable Planning Instances

AAAI Conferences

There has been an astounding improvement in domain-independent planning for solvable instances over the last decades and planners have become increasingly efficient at constructing plans. However, this advancement has not been matched by a similar improvement for identifying unsolvable instances. In this paper, we specialise pattern databases for dead-end detection and, thus, for detecting unsolvable instances. We propose two methods of constructing pattern collections and show that spending any more time constructing the pattern collection is likely to be unproductive. In other words, very few other pattern collections within the given space bounds are able to detect more dead-ends. We show this by carrying out a novel statistical analysis: a large computer cluster has been used to approximate the limit of pattern collections with respect to dead-end detection for many unsolvable instances, and this information is used in the analysis of the proposed methods. Consequently, further improvement must come from combining pattern databases with other techniques, such as mutexes. Furthermore, we explain why one of the proposed methods tends to find significantly more unsolvable variable projections, which is desirable since they imply that the instance is unsolvable. Finally, we compare the best proposed method with the winner and the runner up of the first unsolvability international planning competition, and show that the method is competitive.


Tractable Cost-Optimal Planning over Restricted Polytree Causal Graphs

AAAI Conferences

Causal graphs are widely used to analyze the complexity of planning problems. Many tractable classes have been identified with their aid and state-of-the-art heuristics have been derived by exploiting such classes. In particular, Katz and Keyder have studied causal graphs that are hourglasses (which is a generalization of forks and inverted-forks) and shown that the corresponding cost-optimal planning problem is tractable under certain restrictions. We continue this work by studying polytrees (which is a generalization of hourglasses) under similar restrictions. We prove tractability of cost-optimal planning by providing an algorithm based on a novel notion of variable isomorphism. Our algorithm also sheds light on the k-consistency procedure for identifying unsolvable planning instances. We speculate that this may, at least partially, explain why merge-and-shrink heuristics have been successful for recognizing unsolvable instances.