Logic & Formal Reasoning
Synthesize, Execute and Debug: Learning to Repair for Neural Program Synthesis
Gupta, Kavi, Christensen, Peter Ebert, Chen, Xinyun, Song, Dawn
The use of deep learning techniques has achieved significant progress for program synthesis from input-output examples. However, when the program semantics become more complex, it still remains a challenge to synthesize programs that are consistent with the specification. In this work, we propose SED, a neural program generation framework that incorporates synthesis, execution, and debugging stages. Instead of purely relying on the neural program synthesizer to generate the final program, SED first produces initial programs using the neural program synthesizer component, then utilizes a neural program debugger to iteratively repair the generated programs. The integration of the debugger component enables SED to modify the programs based on the execution results and specification, which resembles the coding process of human programmers. On Karel, a challenging input-output program synthesis benchmark, SED reduces the error rate of the neural program synthesizer itself by a considerable margin, and outperforms the standard beam search for decoding.
Neural-Symbolic Integration: A Compositional Perspective
Tsamoura, Efthymia, Michael, Loizos
Despite significant progress in the development of neural-symbolic frameworks, the question of how to integrate a neural and a symbolic system in a \emph{compositional} manner remains open. Our work seeks to fill this gap by treating these two systems as black boxes to be integrated as modules into a single architecture, without making assumptions on their internal structure and semantics. Instead, we expect only that each module exposes certain methods for accessing the functions that the module implements: the symbolic module exposes a deduction method for computing the function's output on a given input, and an abduction method for computing the function's inputs for a given output; the neural module exposes a deduction method for computing the function's output on a given input, and an induction method for updating the function given input-output training instances. We are, then, able to show that a symbolic module -- with any choice for syntax and semantics, as long as the deduction and abduction methods are exposed -- can be cleanly integrated with a neural module, and facilitate the latter's efficient training, achieving empirical performance that exceeds that of previous work.
On Finite and Unrestricted Query Entailment beyond SQ with Number Restrictions on Transitive Roles
Gogacz, Thomas, Gutiรฉrrez-Basulto, Vรญctor, Ibรกรฑez-Garcรญa, Yazmรญn, Jung, Jean Christoph, Murlak, Filip
We study the description logic SQ with number restrictions applicable to transitive roles, extended with either nominals or inverse roles. We show tight 2EXPTIME upper bounds for unrestricted entailment of regular path queries for both extensions and finite entailment of positive existential queries for nominals. For inverses, we establish 2EXPTIME-completeness for unrestricted and finite entailment of instance queries (the latter under restriction to a single, transitive role).
Measuring Systematic Generalization in Neural Proof Generation with Transformers
Gontier, Nicolas, Sinha, Koustuv, Reddy, Siva, Pal, Christopher
We are interested in understanding how well Transformer language models (TLMs) can perform reasoning tasks when trained on knowledge encoded in the form of natural language. We investigate their systematic generalization abilities on a logical reasoning task in natural language, which involves reasoning over relationships between entities grounded in first-order logical proofs. Specifically, we perform soft theorem-proving by leveraging TLMs to generate natural language proofs. We test the generated proofs for logical consistency, along with the accuracy of the final inference. We observe length-generalization issues when evaluated on longer-than-trained sequences. However, we observe TLMs improve their generalization performance after being exposed to longer, exhaustive proofs. In addition, we discover that TLMs are able to generalize better using backward-chaining proofs compared to their forward-chaining counterparts, while they find it easier to generate forward chaining proofs. We observe that models that are not trained to generate proofs are better at generalizing to problems based on longer proofs. This suggests that Transformers have efficient internal reasoning strategies that are harder to interpret. These results highlight the systematic generalization behavior of TLMs in the context of logical reasoning, and we believe this work motivates deeper inspection of their underlying reasoning strategies.
SeLFiE: Modular Semantic Reasoning for Induction in Isabelle/HOL
Proof assistants offer tactics to apply proof by induction, but these tactics rely on inputs given by human engineers. We address this problem with SeLFiE, a domain-specific language to encode experienced users' expertise on how to apply the induct tactic in Isabelle/HOL: when we apply an induction heuristic written in SeLFiE to an inductive problem and arguments to the induct tactic, the SeLFiE interpreter examines both the syntactic structure of the problem and semantics of the relevant constants to judge whether the arguments to the induct tactic are plausible for that problem according to the heuristic.
Properties of Switch-List Representations of Boolean Functions
Chromรฝ, Miloลก (Charles University, Faculty of Mathematics and Physics, Department of Theoretical Computer Science and Mathematical Logic) | ฤepek, Ondลej (Charles University, Faculty of Mathematics and Physics, Department of Theoretical Computer Science and Mathematical Logic)
In this paper, we focus on a less usual way to represent Boolean functions, namely on representations by switch-lists, which are closely related to interval representations. Given a truth table representation of a Boolean function f the switch-list representation of f is a list of Boolean vectors from the truth table which have a different function value than the preceding Boolean vector in the truth table. The main aim of this paper is to include this type of representation in the Knowledge Compilation Map by Darwiche and Marquis and to argue that switch-lists may in certain situations constitute a reasonable choice for a target language in knowledge compilation. First, we compare switch-list representations with a number of standard representations (such as CNF, DNF, and OBDD) with respect to their relative succinctness. As a by-product of this analysis, we also give a short proof of a longstanding open question proposed by Darwiche and Marquis, namely the incomparability of MODS (models) and PI (prime implicates) representations. Next, using the succinctness result between switch-lists and OBDDs, we develop a polynomial time compilation algorithm from switch-lists to OBDDs. Finally, we analyze which standard transformations and queries (those considered by Darwiche and Marquis) can be performed in polynomial time with respect to the size of the input if the input knowledge is represented by a switch-list. We show that this collection is very broad and the combination of polynomial time transformations and queries is quite unique. Some of the queries can be answered directly using the switch-list input, others require a compilation of the input to OBDD representations which are then used to answer the queries.
A Defeasible Calculus for Zetetic Agents
The study of defeasible reasoning unites epistemologists with those working in AI, in part, because both are interested in epistemic rationality. While it is traditionally thought to govern the formation and (with)holding of beliefs, epistemic rationality may also apply to the interrogative attitudes associated with our core epistemic practice of inquiry, such as wondering, investigating, and curiosity. Since generally intelligent systems should be capable of rational inquiry, AI researchers have a natural interest in the norms that govern interrogative attitudes. Following its recent coinage, we use the term "zetetic" to refer to the properties and norms associated with the capacity to inquire. In this paper, we argue that zetetic norms can be modeled via defeasible inferences to and from questions---a.k.a erotetic inferences---in a manner similar to the way norms of epistemic rationality are represented by defeasible inference rules. We offer a sequent calculus that accommodates the unique features of "erotetic defeat" and that exhibits the computational properties needed to inform the design of zetetic agents. The calculus presented here is an improved version of the one presented in Millson (2019), extended to cover a new class of defeasible erotetic inferences.
Exact Symbolic Inference in Probabilistic Programs via Sum-Product Representations
Saad, Feras A., Rinard, Martin C., Mansinghka, Vikash K.
We present the Sum-Product Probabilistic Language (SPPL), a new system that automatically delivers exact solutions to a broad range of probabilistic inference queries. SPPL symbolically represents the full distribution on execution traces specified by a probabilistic program using a generalization of sum-product networks. SPPL handles continuous and discrete distributions, many-to-one numerical transformations, and a query language that includes general predicates on random variables. We formalize SPPL in terms of a novel translation strategy from probabilistic programs to a semantic domain of sum-product representations, present new algorithms for exactly conditioning on and computing probabilities of queries, and prove their soundness under the semantics. We present techniques for improving the scalability of translation and inference by automatically exploiting conditional independences and repeated structure in SPPL programs. We implement a prototype of SPPL with a modular architecture and evaluate it on a suite of common benchmarks, which establish that our system is up to 3500x faster than state-of-the-art systems for fairness verification; up to 1000x faster than state-of-the-art symbolic algebra techniques; and can compute exact probabilities of rare events in milliseconds.
Abductive Knowledge Induction From Raw Data
Dai, Wang-Zhou, Muggleton, Stephen H.
For many reasoning-heavy tasks, it is challenging to find an appropriate end-to-end differentiable approximation to domain-specific inference mechanisms. Neural-Symbolic (NeSy) AI divides the end-to-end pipeline into neural perception and symbolic reasoning, which can directly exploit general domain knowledge such as algorithms and logic rules. However, it suffers from the exponential computational complexity caused by the interface between the two components, where the neural model lacks direct supervision, and the symbolic model lacks accurate input facts. As a result, they usually focus on learning the neural model with a sound and complete symbolic knowledge base while avoiding a crucial problem: where does the knowledge come from? In this paper, we present Abductive Meta-Interpretive Learning ($Meta_{Abd}$), which unites abduction and induction to learn perceptual neural network and first-order logic theories simultaneously from raw data. Given the same amount of domain knowledge, we demonstrate that $Meta_{Abd}$ not only outperforms the compared end-to-end models in predictive accuracy and data efficiency but also induces logic programs that can be re-used as background knowledge in subsequent learning tasks. To the best of our knowledge, $Meta_{Abd}$ is the first system that can jointly learn neural networks and recursive first-order logic theories with predicate invention.
Applying Inductive Logic Programming to Predicting Gene Function
One of the fastest advancing areas of modern science is functional genomics. This science seeks to understand how the complete complement of molecular components of living organisms (nucleic acid, protein, small molecules, and so on) interact together to form living organisms. Functional genomics is of interest to AI because the relationship between machines and living organisms is central to AI and because the field is an instructive and fun domain to apply and sharpen AI tools and ideas, requiring complex knowledge representation, reasoning, learning, and so on. This article describes two machine learning (inductive logic programming [ILP])-based approaches to the bioinformatic problem of predicting protein function from amino acid sequence. The first approach is based on using ILP as a way of bootstrapping from conventional sequence-based homology methods.