PIE is a Prolog-embedded environment for automated reasoning on the basis of first-order logic. Its main focus is on formulas, as constituents of complex formalizations that are structured through formula macros, and as outputs of reasoning tasks such as second-order quantifier elimination and Craig interpolation. It supports a workflow based on documents that intersperse macro definitions, invocations of reasoners, and LaTeX-formatted natural language text. Starting from various examples, the paper discusses features and application possibilities of PIE along with current limitations and issues for future research.

We consider the task of automated theorem proving, a key AI task. Deep learning has shown promise for training theorem provers, but there are limited human-written theorems and proofs available for supervised learning. To address this limitation, we propose to learn a neural generator that automatically synthesizes theorems and proofs for the purpose of training a theorem prover. Experiments on real-world tasks demonstrate that synthetic data from our approach improves the theorem prover and advances the state of the art of automated theorem proving in Metamath.

Kaliszyk, Cezary, Urban, Josef, Michalewski, Henryk, Olšák, Miroslav

We introduce a theorem proving algorithm that uses practically no domain heuristics for guiding its connection-style proof search. Instead, it runs many Monte-Carlo simulations guided by reinforcement learning from previous proof attempts. We produce several versions of the prover, parameterized by different learning and guiding algorithms. The strongest version of the system is trained on a large corpus of mathematical problems and evaluated on previously unseen problems. The trained system solves within the same number of inferences over 40% more problems than a baseline prover, which is an unusually high improvement in this hard AI domain.

I argue that the most interesting goal facing researchers in automated reasoning is being able to solve problems that cannot currently be solved by existing tools and methods. This may appear obvious, and is clearly not an original thought, but focusing on this as a primary goal allows us to examine other goals in a new light. Many successful theorem provers employ a portfolio of different methods for solving problems. This changes the landscape on which we perform our research: solving problems that can already be solved may not improve the state of the art and a method that can solve a handful of problems unsolvable by current methods, but generally performs poorly on most problems, can be very useful. We acknowledge that forcing new methods to compete against portfolio solvers can stifle innovation. However, this is only the case when comparisons are made at the level of total problems solved. We propose a movement towards focussing on unique solutions in evaluation and competitions i.e. measuring the potential contribution to a portfolio solver. This state of affairs is particularly prominent in first-order logic, which is undecidable. When reasoning in a decidable logic there can be a focus on optimising a decision procedure and measuring average solving times. But in a setting where solutions are difficult to find, average solving times lose meaning, and whilst improving the efficiency of a technique can move potential solutions within acceptable time limits, in general, complementary strategies may be more successful.

Olšák, Miroslav, Kaliszyk, Cezary, Urban, Josef

Automated reasoning and theorem proving have recently become major challenges for machine learning. In other domains, representations that are able to abstract over unimportant transformations, such as abstraction over translations and rotations in vision, are becoming more common. Standard methods of embedding mathematical formulas for learning theorem proving are however yet unable to handle many important transformations. In particular, embedding previously unseen labels, that often arise in definitional encodings and in Skolemization, has been very weak so far. Similar problems appear when transferring knowledge between known symbols. We propose a novel encoding of formulas that extends existing graph neural network models. This encoding represents symbols only by nodes in the graph, without giving the network any knowledge of the original labels. We provide additional links between such nodes that allow the network to recover the meaning and therefore correctly embed such nodes irrespective of the given labels. We test the proposed encoding in an automated theorem prover based on the tableaux connection calculus, and show that it improves on the best characterizations used so far. The encoding is further evaluated on the premise selection task and a newly introduced symbol guessing task, and shown to correctly predict 65% of the symbol names.

PIE is a Prolog-embedded environment for automated reasoning on the basis of first-order logic. It includes a versatile formula macro system and supports the creation of documents that intersperse macro definitions, reasoner invocations and LaTeX-formatted natural language text. Invocation of various reasoners is supported: External provers as well as sub-systems of PIE, which include preprocessors, a Prolog-based first-order prover, methods for Craig interpolation and methods for second-order quantifier elimination.

Steen, Alexander, Benzmüller, Christoph

Leo-III is an automated theorem prover for extensional type theory with Henkin semantics and choice. Reasoning with primitive equality is enabled by adapting paramodulation-based proof search to higher-order logic. The prover may cooperate with multiple external specialist reasoning systems such as first-order provers and SMT solvers. Leo-III is compatible with the TPTP/TSTP framework for input formats, reporting results and proofs, and standardized communication between reasoning systems, enabling e.g. proof reconstruction from within proof assistants such as Isabelle/HOL. Leo-III supports reasoning in polymorphic first-order and higher-order logic, in all normal quantified modal logics, as well as in different deontic logics. Its development had initiated the ongoing extension of the TPTP infrastructure to reasoning within non-classical logics.

"Theorem proving is similar to the game of Go. So, we can probably improve our provers using deep learning, like DeepMind built the super-human computer Go program, AlphaGo." Such optimism has been observed among participants of AITP2017. But is theorem proving really similar to Go? In this paper, we first identify the similarities and differences between them and then propose a system in which various provers keep competing against each other and changing themselves until they prove conjectures provided by users.

Fawzi, Alhussein, Malinowski, Mateusz, Fawzi, Hamza, Fawzi, Omar

Polynomial inequalities lie at the heart of many mathematical disciplines. In this paper, we consider the fundamental computational task of automatically searching for proofs of polynomial inequalities. We adopt the framework of semi-algebraic proof systems that manipulate polynomial inequalities via elementary inference rules that infer new inequalities from the premises. These proof systems are known to be very powerful, but searching for proofs remains a major difficulty. In this work, we introduce a machine learning based method to search for a dynamic proof within these proof systems. We propose a deep reinforcement learning framework that learns an embedding of the polynomials and guides the choice of inference rules, taking the inherent symmetries of the problem as an inductive bias. We compare our approach with powerful and widely-studied linear programming hierarchies based on static proof systems, and show that our method reduces the size of the linear program by several orders of magnitude while also improving performance. These results hence pave the way towards augmenting powerful and well-studied semi-algebraic proof systems with machine learning guiding strategies for enhancing the expressivity of such proof systems.

Zombori, Zsolt, Csiszárik, Adrián, Michalewski, Henryk, Kaliszyk, Cezary, Urban, Josef

We present a reinforcement learning (RL) based guidance system for automated theorem proving geared towards Finding Longer Proofs (FLoP). FLoP focuses on generalizing from short proofs to longer ones of similar structure. To achieve that, FLoP uses state-of-the-art RL approaches that were previously not applied in theorem proving. In particular, we show that curriculum learning significantly outperforms previous learning-based proof guidance on a synthetic dataset of increasingly difficult arithmetic problems.