Goto

Collaborating Authors

 Logic & Formal Reasoning


Measuring Mathematical Problem Solving With the MATH Dataset

arXiv.org Artificial Intelligence

Many intellectual endeavors require mathematical problem solving, but this skill remains beyond the capabilities of computers. To measure this ability in machine learning models, we introduce MATH, a new dataset of 12, 500 challenging competition mathematics problems. Each problem in MATH has a full step-by-step solution which can be used to teach models to generate answer derivations and explanations. To facilitate future research and increase accuracy on MATH, we also contribute a large auxiliary pretraining dataset which helps teach models the fundamentals of mathematics. Even though we are able to increase accuracy on MATH, our results show that accuracy remains relatively low, even with enormous Transformer models. Moreover, we find that simply increasing budgets and model parameter counts will be impractical for achieving strong mathematical reasoning if scaling trends continue. While scaling Transformers is automatically solving most other text-based tasks, scaling is not currently solving MATH. To have more traction on mathematical problem solving we will likely need new algorithmic advancements from the broader research community.


Training a First-Order Theorem Prover from Synthetic Data

arXiv.org Artificial Intelligence

A major challenge in applying machine learning to automated theorem proving is the scarcity of training data, which is a key ingredient in training successful deep learning models. To tackle this problem, we propose an approach that relies on training purely with synthetically generated theorems, without any human data aside from axioms. We use these theorems to train a neurally-guided saturationbased prover. Our neural prover outperforms the state-of-the-art E-prover on this synthetic data in both time and search steps, and shows significant transfer to the unseen human-written theorems from the TPTP library, where it solves 72% of first-order problems without equality. Most work applying machine learning to theorem proving takes the following approach: 1) pick a dataset of formalized mathematics, such as Mizar or Metamath, or the standard library of a major proof assistant such as HOL-Light or Coq; 2) split the dataset into train and test; 3) use imitation learning or reinforcement learning on the training set to learn a policy; and finally 4) evaluate the policy on the test set (Loos et al. (2017), Bansal et al. (2019), Yang & Deng (2019), Han et al. (2021), Polu & Sutskever (2020)). Such methods are fundamentally limited by the size of the training set, particularly when relying on deep neural networks (Kaplan et al., 2020). Unfortunately, unlike in computer vision and natural language processing, theorem proving datasets are comparatively tiny.


Differentiable Inductive Logic Programming for Structured Examples

arXiv.org Artificial Intelligence

The differentiable implementation of logic yields a seamless combination of symbolic reasoning and deep neural networks. Recent research, which has developed a differentiable framework to learn logic programs from examples, can even acquire reasonable solutions from noisy datasets. However, this framework severely limits expressions for solutions, e.g., no function symbols are allowed, and the shapes of clauses are fixed. As a result, the framework cannot deal with structured examples. Therefore we propose a new framework to learn logic programs from noisy and structured examples, including the following contributions. First, we propose an adaptive clause search method by looking through structured space, which is defined by the generality of the clauses, to yield an efficient search space for differentiable solvers. Second, we propose for ground atoms an enumeration algorithm, which determines a necessary and sufficient set of ground atoms to perform differentiable inference functions. Finally, we propose a new method to compose logic programs softly, enabling the system to deal with complex programs consisting of several clauses. Our experiments show that our new framework can learn logic programs from noisy and structured examples, such as sequences or trees. Our framework can be scaled to deal with complex programs that consist of several clauses with function symbols.


Logic Embeddings for Complex Query Answering

arXiv.org Artificial Intelligence

Answering logical queries over incomplete knowledge bases is challenging because: 1) it calls for implicit link prediction, and 2) brute force answering of existential first-order logic queries is exponential in the number of existential variables. Recent work of query embeddings provides fast querying, but most approaches model set logic with closed regions, so lack negation. Query embeddings that do support negation use densities that suffer drawbacks: 1) only improvise logic, 2) use expensive distributions, and 3) poorly model answer uncertainty. In this paper, we propose Logic Embeddings, a new approach to embedding complex queries that uses Skolemisation to eliminate existential variables for efficient querying. It supports negation, but improves on density approaches: 1) integrates well-studied t-norm logic and directly evaluates satisfiability, 2) simplifies modeling with truth values, and 3) models uncertainty with truth bounds. Logic Embeddings are competitively fast and accurate in query answering over large, incomplete knowledge graphs, outperform on negation queries, and in particular, provide improved modeling of answer uncertainty as evidenced by a superior correlation between answer set size and embedding entropy.


New Techniques that Improve ENIGMA-style Clause Selection Guidance

arXiv.org Artificial Intelligence

We re-examine the topic of machine-learned clause selection guidance in saturation-based theorem provers. The central idea, recently popularized by the ENIGMA system, is to learn a classifier for recognizing clauses that appeared in previously discovered proofs. In subsequent runs, clauses classified positively are prioritized for selection. We propose several improvements to this approach and experimentally confirm their viability. For the demonstration, we use a Recursive Neural Network to classify clauses based on their derivation history and the presence or absence of automatically supplied theory axioms therein. The automatic theorem prover Vampire guided by the network achieves a 41 % improvement on a relevant subset of smt-lib in a real time evaluation.


Unfounded Sets for Disjunctive Hybrid MKNF Knowledge Bases

arXiv.org Artificial Intelligence

Combining the closed-world reasoning of answer set programming (ASP) with the open-world reasoning of ontologies broadens the space of applications of reasoners. Disjunctive hybrid MKNF knowledge bases succinctly extend ASP and in some cases without increasing the complexity of reasoning tasks. However, in many cases, solver development is lagging behind. As the result, the only known method of solving disjunctive hybrid MKNF knowledge bases is based on guess-and-verify, as formulated by Motik and Rosati in their original work. A main obstacle is understanding how constraint propagation may be performed by a solver, which, in the context of ASP, centers around the computation of \textit{unfounded atoms}, the atoms that are false given a partial interpretation. In this work, we build towards improving solvers for hybrid MKNF knowledge bases with disjunctive rules: We formalize a notion of unfounded sets for these knowledge bases, identify lower complexity bounds, and demonstrate how we might integrate these developments into a solver. We discuss challenges introduced by ontologies that are not present in the development of solvers for disjunctive logic programs, which warrant some deviations from traditional definitions of unfounded sets. We compare our work with prior definitions of unfounded sets.


Differentiable Logic Machines

arXiv.org Artificial Intelligence

The integration of reasoning, learning, and decision-making is key to build more general AI systems. As a step in this direction, we propose a novel neural-logic architecture that can solve both inductive logic programming (ILP) and deep reinforcement learning (RL) problems. Our architecture defines a restricted but expressive continuous space of first-order logic programs by assigning weights to predicates instead of rules. Therefore, it is fully differentiable and can be efficiently trained with gradient descent. Besides, in the deep RL setting with actor-critic algorithms, we propose a novel efficient critic architecture. Compared to state-of-the-art methods on both ILP and RL problems, our proposition achieves excellent performance, while being able to provide a fully interpretable solution and scaling much better, especially during the testing phase.


Knowledge Graphs

Communications of the ACM

The 1980s saw the evolution of computing as it transitioned from industry to homes through the boom of personal computers. In the field of data management, the Relational Database industry was developing rapidly (Oracle, Sybase, IBM, among others). Object-oriented abstractions were developed as a new form of representational independence. The Internet changed the way people communicated and exchanged information.


Edmund M. Clarke (1945–2020)

Communications of the ACM

Edmund Melson Clarke, Jr., a celebrated American academic who developed methods for mathematically proving the correctness of computer systems, died on December 22, 2020 at the age of 75 from complications of COVID-19. Clarke was awarded the A.M Turing Award in 2008 with his former student E. Allen Emerson and the French computer scientist Joseph Sifakis, for their work on model checking. "I've never liked to fly, although I've done my share of it. I wanted to do something that would make systems like airplanes safer," Clarke said in a 2014 video produced by the Franklin Institute when he was awarded their 2014 Bower Award and Prize for Achievement in Sciencea "For his leading role in the conception and development of techniques for automatically verifying the correctness of a broad array of computer systems, including those found in transportation, communications, and medicine." Model checking is a practical approach for machine verification of mathematical models of hardware, software, communications protocols, and other complex computing systems.


Program Synthesis Guided Reinforcement Learning

arXiv.org Artificial Intelligence

A key challenge for reinforcement learning is solving long-horizon planning and control problems. Recent work has proposed leveraging programs to help guide the learning algorithm in these settings. However, these approaches impose a high manual burden on the user since they must provide a guiding program for every new task they seek to achieve. We propose an approach that leverages program synthesis to automatically generate the guiding program. A key challenge is how to handle partially observable environments. We propose model predictive program synthesis, which trains a generative model to predict the unobserved portions of the world, and then synthesizes a program based on samples from this model in a way that is robust to its uncertainty. We evaluate our approach on a set of challenging benchmarks, including a 2D Minecraft-inspired ``craft'' environment where the agent must perform a complex sequence of subtasks to achieve its goal, a box-world environment that requires abstract reasoning, and a variant of the craft environment where the agent is a MuJoCo Ant. Our approach significantly outperforms several baselines, and performs essentially as well as an oracle that is given an effective program.