Logic & Formal Reasoning
A Relational Tsetlin Machine with Applications to Natural Language Understanding
Saha, Rupsa, Granmo, Ole-Christoffer, Zadorozhny, Vladimir I., Goodwin, Morten
TMs are a pattern recognition approach that uses finite state machines for learning and propositional logic to represent patterns. In addition to being natively interpretable, they have provided competitive accuracy for various tasks. In this paper, we increase the computing power of TMs by proposing a first-order logic-based framework with Herbrand semantics. The resulting TM is relational and can take advantage of logical structures appearing in natural language, to learn rules that represent how actions and consequences are related in the real world. The outcome is a logic program of Horn clauses, bringing in a structured view of unstructured data. In closed-domain question-answering, the first-order representation produces 10x more compact KBs, along with an increase in answering accuracy from 94.83% to 99.48%. The approach is further robust towards erroneous, missing, and superfluous information, distilling the aspects of a text that are important for real-world understanding.
Inductive logic programming at 30
Cropper, Andrew, Dumančić, Sebastijan, Evans, Richard, Muggleton, Stephen H.
Inductive logic programming (ILP) is a form of logic-based machine learning. The goal of ILP is to induce a hypothesis (a logic program) that generalises given training examples and background knowledge. As ILP turns 30, we survey recent work in the field. In this survey, we focus on (i) new meta-level search methods, (ii) techniques for learning recursive programs that generalise from few examples, (iii) new approaches for predicate invention, and (iv) the use of different technologies, notably answer set programming and neural networks. We conclude by discussing some of the current limitations of ILP and discuss directions for future research.
SAT-based Circuit Local Improvement
Kulikov, Alexander S., Slezkin, Nikita
Finding exact circuit size is a notorious optimization problem in practice. Whereas modern computers and algorithmic techniques allow to find a circuit of size seven in blink of an eye, it may take more than a week to search for a circuit of size thirteen. One of the reasons of this behavior is that the search space is enormous: the number of circuits of size $s$ is $s^{\Theta(s)}$, the number of Boolean functions on $n$ variables is $2^{2^n}$. In this paper, we explore the following natural heuristic idea for decreasing the size of a given circuit: go through all its subcircuits of moderate size and check whether any of them can be improved by reducing to SAT. This may be viewed as a local search approach: we search for a smaller circuit in a ball around a given circuit. We report the results of experiments with various symmetric functions.
Refinement Type Directed Search for Meta-Interpretive-Learning of Higher-Order Logic Programs
The program synthesis problem within the Inductive Logic Programming (ILP) community has typically been seen as untyped. We consider the benefits of user provided types on background knowledge. Building on the Meta-Interpretive Learning (MIL) framework, we show that type checking is able to prune large parts of the hypothesis space of programs. The introduction of polymorphic type checking to the MIL approach to logic program synthesis is validated by strong theoretical and experimental results, showing a cubic reduction in the size of the search space and synthesis time, in terms of the number of typed background predicates. Additionally we are able to infer polymorphic types of synthesized clauses and of entire programs. The other advancement is in developing an approach to leveraging refinement types in ILP. Here we show that further pruning of the search space can be achieved, though the SMT solving used for refinement type checking comes at a significant cost timewise.
Learning Logic Programs by Explaining Failures
Scientists form hypotheses and experimentally test them. If a hypothesis fails (is refuted), scientists try to explain the failure to eliminate other hypotheses. We introduce similar explanation techniques for inductive logic programming (ILP). We build on the ILP approach learning from failures. Given a hypothesis represented as a logic program, we test it on examples. If a hypothesis fails, we identify clauses and literals responsible for the failure. By explaining failures, we can eliminate other hypotheses that will provably fail. We introduce a technique for failure explanation based on analysing SLD-trees. We experimentally evaluate failure explanation in the Popper ILP system. Our results show that explaining failures can drastically reduce learning times.
An asymptotic analysis of probabilistic logic programming with implications for expressing projective families of distributions
Over the last years, there has been increasing research on the scaling behaviour of statistical relational representations with the size of the domain, and on the connections between domain size dependence and lifted inference. In particular, the asymptotic behaviour of statistical relational representations has come under scrutiny, and projectivity was isolated as the strongest form of domain size independence. In this contribution we show that every probabilistic logic program under the distribution semantics is asymptotically equivalent to a probabilistic logic program consisting only of range-restricted clauses over probabilistic facts. To facilitate the application of classical results from finite model theory, we introduce the abstract distribution semantics, defined as an arbitrary logical theory over probabilistic facts to bridge the gap to the distribution semantics underlying probabilistic logic programming. In this representation, range-restricted logic programs correspond to quantifier-free theories, making asymptotic quantifier results avilable for use. We can conclude that every probabilistic logic program inducing a projective family of distributions is in fact captured by this class, and we can infer interesting consequences for the expressivity of probabilistic logic programs as well as for the asymptotic behaviour of probabilistic rules.
A Qualitative Theory of Cognitive Attitudes and their Change
Since the seminal work of Hintikka on epistemic logic [28], of Von Wright on the logic of preference [55, 56] and of Cohen & Levesque on the logic of intention [19], many formal logics for reasoning about cognitive attitudes of agents such as knowledge and belief [24], preference [32, 48], desire [23], intention [44, 30] and their combination [38, 54] have been proposed. Generally speaking, these logics are nothing but formal models of rational agency relying on the idea that an agent endowed with cognitive attitudes makes decisions on the basis of what she believes and of what she desires or prefers. The idea of describing rational agents in terms of their epistemic and motivational attitudes is something that these logics share with classical decision theory and game theory. Classical decision theory and game theory provide a quantitative account of individual and strategic decision-making by assuming that agents' beliefs and desires can be respectively modeled by subjective probabilities and utilities. Qualitative approaches to individual and strategic decision-making have been proposed in AI [16, 22] to characterize criteria that a rational agent should adopt for making decisions when she cannot build a probability distribution over the set of possible events and her preference over the set of possible outcomes cannot be expressed by a utility function but only by a qualitative ordering over the outcomes.
Nominal Unification and Matching of Higher Order Expressions with Recursive Let
Schmidt-Schauß, Manfred, Kutsia, Temur, Levy, Jordi, Villaret, Mateu, Kutz, Yunus
A sound and complete algorithm for nominal unification of higher-order expressions with a recursive let is described, and shown to run in nondeterministic polynomial time. We also explore specializations like nominal letrec-matching for expressions, for DAGs, and for garbage-free expressions and determine their complexity. Finally, we also provide a nominal unification algorithm for higher-order expressions with recursive let and atom-variables, where we show that it also runs in nondeterministic polynomial time.
Proof Artifact Co-training for Theorem Proving with Language Models
Han, Jesse Michael, Rute, Jason, Wu, Yuhuai, Ayers, Edward W., Polu, Stanislas
Labeled data for imitation learning of theorem proving in large libraries of formalized mathematics is scarce as such libraries require years of concentrated effort by human specialists to be built. This is particularly challenging when applying large Transformer language models to tactic prediction, because the scaling of performance with respect to model size is quickly disrupted in the data-scarce, easily-overfitted regime. We propose PACT ({\bf P}roof {\bf A}rtifact {\bf C}o-{\bf T}raining), a general methodology for extracting abundant self-supervised data from kernel-level proof terms for co-training alongside the usual tactic prediction objective. We apply this methodology to Lean, an interactive proof assistant which hosts some of the most sophisticated formalized mathematics to date. We instrument Lean with a neural theorem prover driven by a Transformer language model and show that PACT improves theorem proving success rate on a held-out suite of test theorems from 32\% to 48\%.
A Compositional Atlas of Tractable Circuit Operations: From Simple Transformations to Complex Information-Theoretic Queries
Vergari, Antonio, Choi, YooJung, Liu, Anji, Teso, Stefano, Broeck, Guy Van den
Circuit representations are becoming the lingua franca to express and reason about tractable generative and discriminative models. In this paper, we show how complex inference scenarios for these models that commonly arise in machine learning -- from computing the expectations of decision tree ensembles to information-theoretic divergences of deep mixture models -- can be represented in terms of tractable modular operations over circuits. Specifically, we characterize the tractability of a vocabulary of simple transformations -- sums, products, quotients, powers, logarithms, and exponentials -- in terms of sufficient structural constraints of the circuits they operate on, and present novel hardness results for the cases in which these properties are not satisfied. Building on these operations, we derive a unified framework for reasoning about tractable models that generalizes several results in the literature and opens up novel tractable inference scenarios.