Logic & Formal Reasoning
The informal semantics of Answer Set Programming: A Tarskian perspective
Denecker, Marc, Lierler, Yuliya, truszczynski, Miroslaw, Vennekens, Joost
In Knowledge Representation, it is crucial that knowledge engineers have a good understanding of the formal expressions that they write. What formal expressions state intuitively about the domain of discourse is studied in the theory of the informal semantics of a logic. In this paper we study the informal semantics of Answer Set Programming. The roots of answer set programming lie in the language of Extended Logic Programming, which was introduced initially as an epistemic logic for default and autoepistemic reasoning. In 1999, the seminal papers on answer set programming proposed to use this logic for a different purpose, namely, to model and solve search problems. Currently, the language is used primarily in this new role. However, the original epistemic intuitions lose their explanatory relevance in this new context. How answer set programs are connected to the specifications of problems they model is more easily explained in a classical Tarskian semantics, in which models correspond to possible worlds, rather than to belief states of an epistemic agent. In this paper, we develop a new theory of the informal semantics of answer set programming, which is formulated in the Tarskian setting and based on Frege's compositionality principle. It differs substantially from the earlier epistemic theory of informal semantics, providing a different view on the meaning of the connectives in answer set programming and on its relation to other logics, in particular classical logic.
On Inductive Abilities of Latent Factor Models for Relational Learning
Trouillon, Théo, Gaussier, Eric, Dance, Christopher R., Bouchard, Guillaume
Latent factor models are increasingly popular for modeling multi-relational knowledge graphs. By their vectorial nature, it is not only hard to interpret why this class of models works so well, but also to understand where they fail and how they might be improved. We conduct an experimental survey of state-of-the-art models, not towards a purely comparative end, but as a means to get insight about their inductive abilities. To assess the strengths and weaknesses of each model, we create simple tasks that exhibit first, atomic properties of binary relations, and then, common inter-relational inference through synthetic genealogies. Based on these experimental results, we propose new research directions to improve on existing models.
High-Fidelity Vector Space Models of Structured Data
Crouse, Maxwell, Fokoue, Achille, Chang, Maria, Kapanipathi, Pavan, Musa, Ryan, Nakos, Constantine, Wu, Lingfei, Forbus, Kenneth, Witbrock, Michael
Machine learning systems regularly deal with structured data in real-world applications. Unfortunately, such data has been difficult to faithfully represent in a way that most machine learning techniques would expect, i.e. as a real-valued vector of a fixed, pre-specified size. In this work, we introduce a novel approach that compiles structured data into a satisfiability problem which has in its set of solutions at least (and often only) the input data. The satisfiability problem is constructed from constraints which are generated automatically a priori from a given signature, thus trivially allowing for a bag-of-words-esque vector representation of the input to be constructed. The method is demonstrated in two areas, automated reasoning and natural language processing, where it is shown to produce vector representations of natural-language sentences and first-order logic clauses that can be precisely translated back to their original, structured input forms.
Optimizing Answer Set Computation via Heuristic-Based Decomposition
Calimeri, Francesco, Perri, Simona, Zangari, Jessica
Answer Set Programming (ASP) is a purely declarative formalism developed in the field of logic programming and nonmonotonic reasoning: computational problems are encoded by logic programs whose answer sets, corresponding to solutions, are computed by an ASP system. Different, semantically equivalent, programs can be defined for the same problem; however, performance of systems evaluating them might significantly vary. We propose an approach for automatically transforming an input logic program into an equivalent one that can be evaluated more efficiently. One can make use of existing tree-decomposition techniques for rewriting selected rules into a set of multiple ones; the idea is to guide and adaptively apply them on the basis of proper new heuristics, to obtain a smart rewriting algorithm to be integrated into an ASP system. The method is rather general: it can be adapted to any system and implement different preference policies. Furthermore, we define a set of new heuristics tailored at optimizing grounding, one of the main phases of the ASP computation; we use them in order to implement the approach into the ASP system DLV, in particular into its grounding subsystem I-DLV, and carry out an extensive experimental activity for assessing the impact of the proposal. Under consideration in Theory and Practice of Logic Programming (TPLP).
Game Semantics and Linear Logic in the Cognition Process
A description of the environment cognition process by intelligent systems with a fixed set of system goals is suggested. Such a system is represented by the set of its goals only without any models of the system elements or the environment. The set has a lattice structure and a monoid structure; thus, the structure of linear logic is defined on the set. The cognition process of some environment by the system is described on this basis. The environment is represented as a configuration space of possible system positions which are estimated by an information amount (by corresponding sets). This information is supplied to the system by the environment. Thus, it is possible to define the category of Conway games with a payoff on the configuration space and to choose an optimal system's play (i.e., a trajectory). The choice is determined by the requirement of maximal information increasing and takes into account the structure of the system goal set: the linear logic on the set is used to determine the priority of possible different parallel processes. The survey may be useful to describe the behavior of robots and simple biological systems, e.g., ants.
Learning Loop Invariants for Program Verification
Si, Xujie, Dai, Hanjun, Raghothaman, Mukund, Naik, Mayur, Song, Le
A fundamental problem in program verification concerns inferring loop invariants. The problem is undecidable and even practical instances are challenging. Inspired by how human experts construct loop invariants, we propose a reasoning framework Code2Inv that constructs the solution by multi-step decision making and querying an external program graph memory block. By training with reinforcement learning, Code2Inv captures rich program features and avoids the need for ground truth solutions as supervision. Compared to previous learning tasks in domains with graph-structured data, it addresses unique challenges, such as a binary objective function and an extremely sparse reward that is given by an automated theorem prover only after the complete loop invariant is proposed. We evaluate Code2Inv on a suite of 133 benchmark problems and compare it to three state-of-the-art systems. It solves 106 problems compared to 73 by a stochastic search-based system, 77 by a heuristic search-based system, and 100 by a decision tree learning-based system. Moreover, the strategy learned can be generalized to new programs: compared to solving new instances from scratch, the pre-trained agent is more sample efficient in finding solutions.
Improving Neural Program Synthesis with Inferred Execution Traces
Shin, Richard, Polosukhin, Illia, Song, Dawn
The task of program synthesis, or automatically generating programs that are consistent with a provided specification, remains a challenging task in artificial intelligence. As in other fields of AI, deep learning-based end-to-end approaches have made great advances in program synthesis. However, more so than other fields such as computer vision, program synthesis provides greater opportunities to explicitly exploit structured information such as execution traces, which contain a superset of the information input/output pairs. While they are highly useful for program synthesis, as execution traces are more difficult to obtain than input/output pairs, we use the insight that we can split the process into two parts: infer the trace from the input/output example, then infer the program from the trace. This simple modification leads to state-of-the-art results in program synthesis in the Karel domain, improving accuracy to 81.3% from the 77.12% of prior work.
Automatic Program Synthesis of Long Programs with a Learned Garbage Collector
We consider the problem of generating automatic code given sample input-output pairs. We train a neural network to map from the current state and the outputs to the program's next statement. The neural network optimizes multiple tasks concurrently: the next operation out of a set of high level commands, the operands of the next statement, and which variables can be dropped from memory. Using our method we are able to create programs that are more than twice as long as existing state-of-the-art solutions, while improving the success rate for comparable lengths, and cutting the run-time by two orders of magnitude. Our code, including an implementation of various literature baselines, is publicly available at https: //github.com/amitz25/PCCoder
Reinforcement Learning of Theorem Proving
Kaliszyk, Cezary, Urban, Josef, Michalewski, Henryk, Olšák, Miroslav
We introduce a theorem proving algorithm that uses practically no domain heuristics for guiding its connection-style proof search. Instead, it runs many Monte-Carlo simulations guided by reinforcement learning from previous proof attempts. We produce several versions of the prover, parameterized by different learning and guiding algorithms. The strongest version of the system is trained on a large corpus of mathematical problems and evaluated on previously unseen problems. The trained system solves within the same number of inferences over 40% more problems than a baseline prover, which is an unusually high improvement in this hard AI domain. To our knowledge this is the first time reinforcement learning has been convincingly applied to solving general mathematical problems on a large scale.
Learning Pipelines with Limited Data and Domain Knowledge: A Study in Parsing Physics Problems
Sachan, Mrinmaya, Dubey, Kumar Avinava, Mitchell, Tom M., Roth, Dan, Xing, Eric P.
As machine learning becomes more widely used in practice, we need new methods to build complex intelligent systems that integrate learning with existing software, and with domain knowledge encoded as rules. As a case study, we present such a system that learns to parse Newtonian physics problems in textbooks. This system, Nuts&Bolts, learns a pipeline process that incorporates existing code, pre-learned machine learning models, and human engineered rules. It jointly trains the entire pipeline to prevent propagation of errors, using a combination of labelled and unlabelled data. Our approach achieves a good performance on the parsing task, outperforming the simple pipeline and its variants. Finally, we also show how Nuts&Bolts can be used to achieve improvements on a relation extraction task and on the end task of answering Newtonian physics problems.