Goto

Collaborating Authors

 Logic & Formal Reasoning


Synthesizing Reactive Test Environments for Autonomous Systems: Testing Reach-Avoid Specifications with Multi-Commodity Flows

arXiv.org Artificial Intelligence

We study automated test generation for verifying discrete decision-making modules in autonomous systems. We utilize linear temporal logic to encode the requirements on the system under test in the system specification and the behavior that we want to observe during the test is given as the test specification which is unknown to the system. First, we use the specifications and their corresponding non-deterministic B\"uchi automata to generate the specification product automaton. Second, a virtual product graph representing the high-level interaction between the system and the test environment is constructed modeling the product automaton encoding the system, the test environment, and specifications. The main result of this paper is an optimization problem, framed as a multi-commodity network flow problem, that solves for constraints on the virtual product graph which can then be projected to the test environment. Therefore, the result of the optimization problem is reactive test synthesis that ensures that the system meets the test specifications along with satisfying the system specifications. This framework is illustrated in simulation on grid world examples, and demonstrated on hardware with the Unitree A1 quadruped, wherein dynamic locomotion behaviors are verified in the context of reactive test environments.


Formal Specifications from Natural Language

arXiv.org Artificial Intelligence

We study the generalization abilities of language models when translating natural language into formal specifications with complex semantics. In particular, we fine-tune language models on three datasets consisting of English sentences and their corresponding formal representation: 1) regular expressions (regex), frequently used in programming and search; 2) First-order logic (FOL), commonly used in software verification and theorem proving; and 3) linear-time temporal logic (LTL), which forms the basis for industrial hardware specification languages. Our experiments show that, in these diverse domains, the language models maintain their generalization capabilities from pre-trained knowledge of natural language to generalize, e.g., to new variable names or operator descriptions. Additionally, they achieve competitive performance, and even outperform the state-of-the-art for translating into regular expressions, with the benefits of being easy to access, efficient to fine-tune, and without a particular need for domain-specific reasoning.


Knowledge-based and Data-driven Reasoning and Learning for Ad Hoc Teamwork

arXiv.org Artificial Intelligence

We present an architecture for ad hoc teamwork, which refers to collaboration in a team of agents without prior coordination. State of the art methods for this problem often include a data-driven component that uses a long history of prior observations to model the behaviour of other agents (or agent types) and to determine the ad hoc agent's behaviour. In many practical domains, it is challenging to find large training datasets, and necessary to understand and incrementally extend the existing models to account for changes in team composition or domain attributes. Our architecture combines the principles of knowledge-based and data-driven reasoning and learning. Specifically, we enable an ad hoc agent to perform non-monotonic logical reasoning with prior commonsense domain knowledge and incrementally-updated simple predictive models of other agents' behaviour. We use the benchmark simulated multi-agent collaboration domain Fort Attack to demonstrate that our architecture supports adaptation to unforeseen changes, incremental learning and revision of models of other agents' behaviour from limited samples, transparency in the ad hoc agent's decision making, and better performance than a data-driven baseline.


Learning to Find Proofs and Theorems by Learning to Refine Search Strategies: The Case of Loop Invariant Synthesis

arXiv.org Artificial Intelligence

We propose a new approach to automated theorem proving where an AlphaZero-style agent is self-training to refine a generic high-level expert strategy expressed as a nondeterministic program. An analogous teacher agent is self-training to generate tasks of suitable relevance and difficulty for the learner. This allows leveraging minimal amounts of domain knowledge to tackle problems for which training data is unavailable or hard to synthesize. As a specific illustration, we consider loop invariant synthesis for imperative programs and use neural networks to refine both the teacher and solver strategies.


A Direct Approximation of AIXI Using Logical State Abstractions

arXiv.org Artificial Intelligence

We propose a practical integration of logical state abstraction with AIXI, a Bayesian optimality notion for reinforcement learning agents, to significantly expand the model class that AIXI agents can be approximated over to complex history-dependent and structured environments. The state representation and reasoning framework is based on higher-order logic, which can be used to define and enumerate complex features on non-Markovian and structured environments. We address the problem of selecting the right subset of features to form state abstractions by adapting the $\Phi$-MDP optimisation criterion from state abstraction theory. Exact Bayesian model learning is then achieved using a suitable generalisation of Context Tree Weighting over abstract state sequences. The resultant architecture can be integrated with different planning algorithms. Experimental results on controlling epidemics on large-scale contact networks validates the agent's performance.


Synthesizing explainable counterfactual policies for algorithmic recourse with program synthesis

arXiv.org Artificial Intelligence

Being able to provide counterfactual interventions - sequences of actions we would have had to take for a desirable outcome to happen - is essential to explain how to change an unfavourable decision by a black-box machine learning model (e.g., being denied a loan request). Existing solutions have mainly focused on generating feasible interventions without providing explanations on their rationale. Moreover, they need to solve a separate optimization problem for each user. In this paper, we take a different approach and learn a program that outputs a sequence of explainable counterfactual actions given a user description and a causal graph. We leverage program synthesis techniques, reinforcement learning coupled with Monte Carlo Tree Search for efficient exploration, and rule learning to extract explanations for each recommended action. An experimental evaluation on synthetic and real-world datasets shows how our approach generates effective interventions by making orders of magnitude fewer queries to the black-box classifier with respect to existing solutions, with the additional benefit of complementing them with interpretable explanations.


Communication between agents in dynamic epistemic logic

arXiv.org Artificial Intelligence

This manuscript studies actions of communication between epistemic logic agents. It starts by looking into actions through which all/some agents share all their information, defining the model operation that transforms the model, discussing its properties, introducing a modality for describing it and providing an axiom system for the latter. The main part of the manuscript focuses on an action through which some agents share part of their information: they share all that they know about a topic defined by a given formula. Once again, the manuscript defines the model operation that transforms the model, discusses its properties, introduces a modality for describing it and provides an axiom system for the latter.


Treewidth-aware Reductions of Normal ASP to SAT -- Is Normal ASP Harder than SAT after All?

arXiv.org Artificial Intelligence

Answer Set Programming (ASP) is a paradigm for modeling and solving problems for knowledge representation and reasoning. There are plenty of results dedicated to studying the hardness of (fragments of) ASP. So far, these studies resulted in characterizations in terms of computational complexity as well as in fine-grained insights presented in form of dichotomy-style results, lower bounds when translating to other formalisms like propositional satisfiability (SAT), and even detailed parameterized complexity landscapes. A generic parameter in parameterized complexity originating from graph theory is the so-called treewidth, which in a sense captures structural density of a program. Recently, there was an increase in the number of treewidth-based solvers related to SAT. While there are translations from (normal) ASP to SAT, no reduction that preserves treewidth or at least keeps track of the treewidth increase is known. In this paper we propose a novel reduction from normal ASP to SAT that is aware of the treewidth, and guarantees that a slight increase of treewidth is indeed sufficient. Further, we show a new result establishing that, when considering treewidth, already the fragment of normal ASP is slightly harder than SAT (under reasonable assumptions in computational complexity). This also confirms that our reduction probably cannot be significantly improved and that the slight increase of treewidth is unavoidable. Finally, we present an empirical study of our novel reduction from normal ASP to SAT, where we compare treewidth upper bounds that are obtained via known decomposition heuristics. Overall, our reduction works better with these heuristics than existing translations.


Machine Learning Meets The Herbrand Universe

arXiv.org Artificial Intelligence

The appearance of strong CDCL-based propositional (SAT) solvers has greatly advanced several areas of automated reasoning (AR). One of the directions in AR is thus to apply SAT solvers to expressive formalisms such as first-order logic, for which large corpora of general mathematical problems exist today. This is possible due to Herbrand's theorem, which allows reduction of first-order problems to propositional problems by instantiation. The core challenge is choosing the right instances from the typically infinite Herbrand universe. In this work, we develop the first machine learning system targeting this task, addressing its combinatorial and invariance properties. In particular, we develop a GNN2RNN architecture based on an invariant graph neural network (GNN) that learns from problems and their solutions independently of symbol names (addressing the abundance of skolems), combined with a recurrent neural network (RNN) that proposes for each clause its instantiations. The architecture is then trained on a corpus of mathematical problems and their instantiation-based proofs, and its performance is evaluated in several ways. We show that the trained system achieves high accuracy in predicting the right instances, and that it is capable of solving many problems by educated guessing when combined with a ground solver. To our knowledge, this is the first convincing use of machine learning in synthesizing relevant elements from arbitrary Herbrand universes.


Opportunistic Qualitative Planning in Stochastic Systems with Incomplete Preferences over Reachability Objectives

arXiv.org Artificial Intelligence

Preferences play a key role in determining what goals/constraints to satisfy when not all constraints can be satisfied simultaneously. In this paper, we study how to synthesize preference satisfying plans in stochastic systems, modeled as an MDP, given a (possibly incomplete) combinative preference model over temporally extended goals. We start by introducing new semantics to interpret preferences over infinite plays of the stochastic system. Then, we introduce a new notion of improvement to enable comparison between two prefixes of an infinite play. Based on this, we define two solution concepts called safe and positively improving (SPI) and safe and almost-surely improving (SASI) that enforce improvements with a positive probability and with probability one, respectively. We construct a model called an improvement MDP, in which the synthesis of SPI and SASI strategies that guarantee at least one improvement reduces to computing positive and almost-sure winning strategies in an MDP. We present an algorithm to synthesize the SPI and SASI strategies that induce multiple sequential improvements. We demonstrate the proposed approach using a robot motion planning problem.