Logic & Formal Reasoning
Project proposal: A modular reinforcement learning based automated theorem prover
We propose to build a reinforcement learning prover of independent components: a deductive system (an environment), the proof state representation (how an agent sees the environment), and an agent training algorithm. To that purpose, we contribute an additional Vampire-based environment to $\texttt{gym-saturation}$ package of OpenAI Gym environments for saturation provers. We demonstrate a prototype of using $\texttt{gym-saturation}$ together with a popular reinforcement learning framework (Ray $\texttt{RLlib}$). Finally, we discuss our plans for completing this work in progress to a competitive automated theorem prover.
Bayesian Statistical Model Checking for Multi-agent Systems using HyperPCTL*
Das, Spandan, Prabhakar, Pavithra
In this paper, we present a Bayesian method for statistical model checking (SMC) of probabilistic hyperproperties specified in the logic HyperPCTL* on discrete-time Markov chains (DTMCs). While SMC of HyperPCTL* using sequential probability ratio test (SPRT) has been explored before, we develop an alternative SMC algorithm based on Bayesian hypothesis testing. In comparison to PCTL*, verifying HyperPCTL* formulae is complex owing to their simultaneous interpretation on multiple paths of the DTMC. In addition, extending the bottom-up model-checking algorithm of the non-probabilistic setting is not straight forward due to the fact that SMC does not return exact answers to the satisfiability problems of subformulae, instead, it only returns correct answers with high-confidence. We propose a recursive algorithm for SMC of HyperPCTL* based on a modified Bayes' test that factors in the uncertainty in the recursive satisfiability results. We have implemented our algorithm in a Python toolbox, HyProVer, and compared our approach with the SPRT based SMC. Our experimental evaluation demonstrates that our Bayesian SMC algorithm performs better both in terms of the verification time and the number of samples required to deduce satisfiability of a given HyperPCTL* formula.
Finite-Cliquewidth Sets of Existential Rules: Toward a General Criterion for Decidable yet Highly Expressive Querying
Feller, Thomas, Lyon, Tim S., Ostropolski-Nalewaja, Piotr, Rudolph, Sebastian
In our pursuit of generic criteria for decidable ontology-based querying, we introduce 'finite-cliquewidth sets' (FCS) of existential rules, a model-theoretically defined class of rule sets, inspired by the cliquewidth measure from graph theory. By a generic argument, we show that FCS ensures decidability of entailment for a sizable class of queries (dubbed 'DaMSOQs') subsuming conjunctive queries (CQs). The FCS class properly generalizes the class of finite-expansion sets (FES), and for signatures of arity at most 2, the class of bounded-treewidth sets (BTS). For higher arities, BTS is only indirectly subsumed by FCS by means of reification. Despite the generality of FCS, we provide a rule set with decidable CQ entailment (by virtue of first-order-rewritability) that falls outside FCS, thus demonstrating the incomparability of FCS and the class of finite-unification sets (FUS). In spite of this, we show that if we restrict ourselves to single-headed rule sets over signatures of arity at most 2, then FCS subsumes FUS.
Trust in Language Grounding: a new AI challenge for human-robot teams
Bossens, David M., Evers, Christine
The challenge of language grounding is to fully understand natural language by grounding language in real-world referents. While AI techniques are available, the widespread adoption and effectiveness of such technologies for human-robot teams relies critically on user trust. This survey provides three contributions relating to the newly emerging field of trust in language grounding, including a) an overview of language grounding research in terms of AI technologies, data sets, and user interfaces; b) six hypothesised trust factors relevant to language grounding, which are tested empirically on a human-robot cleaning team; and c) future research directions for trust in language grounding.
FOLIO: Natural Language Reasoning with First-Order Logic
Han, Simeng, Schoelkopf, Hailey, Zhao, Yilun, Qi, Zhenting, Riddell, Martin, Benson, Luke, Sun, Lucy, Zubova, Ekaterina, Qiao, Yujie, Burtell, Matthew, Peng, David, Fan, Jonathan, Liu, Yixin, Wong, Brian, Sailor, Malcolm, Ni, Ansong, Nan, Linyong, Kasai, Jungo, Yu, Tao, Zhang, Rui, Joty, Shafiq, Fabbri, Alexander R., Kryscinski, Wojciech, Lin, Xi Victoria, Xiong, Caiming, Radev, Dragomir
We present FOLIO, a human-annotated, open-domain, and logically complex and diverse dataset for reasoning in natural language (NL), equipped with first order logic (FOL) annotations. FOLIO consists of 1,435 examples (unique conclusions), each paired with one of 487 sets of premises which serve as rules to be used to deductively reason for the validity of each conclusion. The logical correctness of premises and conclusions is ensured by their parallel FOL annotations, which are automatically verified by our FOL inference engine. In addition to the main NL reasoning task, NL-FOL pairs in FOLIO automatically constitute a new NL-FOL translation dataset using FOL as the logical form. Our experiments on FOLIO systematically evaluate the FOL reasoning ability of supervised fine-tuning on medium-sized language models (BERT, RoBERTa) and few-shot prompting on large language models (GPT-NeoX, OPT, GPT-3, Codex). For NL-FOL translation, we experiment with GPT-3 and Codex. Our results show that one of the most capable Large Language Model (LLM) publicly available, GPT-3 davinci, achieves only slightly better than random results with few-shot prompting on a subset of FOLIO, and the model is especially bad at predicting the correct truth values for False and Unknown conclusions. Our dataset and code are available at https://github.com/Yale-LILY/FOLIO.
Learning Automata-Based Complex Event Patterns in Answer Set Programming
Katzouris, Nikos, Paliouras, Georgios
Complex Event Recognition and Forecasting (CER/F) techniques attempt to detect, or even forecast ahead of time, event occurrences in streaming input using predefined event patterns. Such patterns are not always known in advance, or they frequently change over time, making machine learning techniques, capable of extracting such patterns from data, highly desirable in CER/F. Since many CER/F systems use symbolic automata to represent such patterns, we propose a family of such automata where the transition-enabling conditions are defined by Answer Set Programming (ASP) rules, and which, thanks to the strong connections of ASP to symbolic learning, are directly learnable from data. We present such a learning approach in ASP and an incremental version thereof that trades optimality for efficiency and is capable to scale to large datasets. We evaluate our approach on two CER datasets and compare it to state-of-the-art automata learning techniques, demonstrating empirically a superior performance, both in terms of predictive accuracy and scalability.
Amazon AI Research Presents A Formal Method To Locate Root Causes Of Outliers
Rare observations where a system deviates from its typical behavior are known as outliers. They appear in numerous real-world applications (such as medicine and finance) and necessitate more explanation than regular occurrences. Outliers in the training dataset can cause problems during a model fitting in machine learning (esp., linear models). It may also result in inflated error metrics that give larger errors more weight. As a result, it is necessary to treat outliers and determine their root cause before developing a machine learning model.
A logical theory for strong and weak ontic necessities in branching time
Ontic necessities are those modalities universally quantifying over domains of ontic possibilities, whose ``existence'' is independent of our knowledge. An ontic necessity, called the weak ontic necessity, causes interesting questions. An example for it is ``I should be dead by now''. A feature of this necessity is whether it holds at a state has nothing to do with whether its prejacent holds at the state. Is there a weak epistemic necessity expressed by ``should''? Is there a strong ontic necessity expressed by ``must''? How do we make sense of the strong and weak ontic necessities formally? In this paper, we do the following work. Firstly, we recognize strong/weak ontic/epistemic necessities and give our general ideas about them. Secondly, we present a complete logical theory for the strong and weak ontic necessities in branching time. This theory is based on the following approach. The weak ontic necessity quantifies over a domain of expected timelines, determined by the agent's system of ontic rules. The strong ontic necessity quantifies over a domain of accepted timelines, determined by undefeatable ontic rules.
Automating UAV Flight Readiness Approval using Goal-Directed Answer Set Programming
Varanasi, Sarat Chandra, Meng, Baoluo, Alexander, Christopher, Borgyos, Szabolcs, Hall, Brendan
We present a novel application of Goal-Directed Answer Set Programming that digitizes the model aircraft operator's compliance verification against the Academy of Model Aircrafts (AMA) safety code. The AMA safety code regulates how AMA flyers operate Unmanned Aerial Vehicles (UAVs) for limited recreational purposes. Flying drones and their operators are subject to various rules before and after the operation of the aircraft to ensure safe flights. In this paper, we leverage Answer Set Programming to encode the AMA safety code and automate compliance checks. To check compliance, we use the s(CASP) which is a goal-directed ASP engine. By using s(CASP) the operators can easily check for violations and obtain a justification tree explaining the cause of the violations in human-readable natural language. Further, we implement an algorithm to help the operators obtain the minimal set of conditions that need to be satisfied in order to pass the compliance check. We develop a front-end questionnaire interface that accepts various conditions and use the backend s(CASP) engine to evaluate whether the conditions adhere to the regulations. We also leverage s(CASP) implemented in SWI-Prolog, where SWI-Prolog exposes the reasoning capabilities of s(CASP) as a REST service. To the best of our knowledge, this is the first application of ASP in the AMA and Avionics Compliance and Certification space.
Advanced Tools and Methods for Treewidth-Based Problem Solving -- Extended Abstract
Computer programs, so-called solvers, for solving the well-known Boolean satisfiability problem (Sat) have been improving for decades. Among the reasons, why these solvers are so fast, is the implicit usage of the formula's structural properties during solving. One of such structural indicators is the so-called treewidth, which tries to measure how close a formula instance is to being easy (tree-like). This work focuses on logic-based problems and treewidth-based methods and tools for solving them. Many of these problems are also relevant for knowledge representation and reasoning (KR) as well as artificial intelligence (AI) in general. We present a new type of problem reduction, which is referred to by decomposition-guided (DG). This reduction type forms the basis to solve a problem for quantified Boolean formulas (QBFs) of bounded treewidth that has been open since 2004. The solution of this problem then gives rise to a new methodology for proving precise lower bounds for a range of further formalisms in logic, KR, and AI. Despite the established lower bounds, we implement an algorithm for solving extensions of Sat efficiently, by directly using treewidth. Our implementation is based on finding abstractions of instances, which are then incrementally refined in the process. Thereby, our observations confirm that treewidth is an important measure that should be considered in the design of modern solvers.