Modern SAT solvers routinely operate at scales that make it impractical to query a neural network for every branching decision. NeuroCore, proposed by Selsam and Bjørner, offered a proof of concept that neural networks can still accelerate SAT solvers by only periodically refocusing a score-based branching heuristic. However, that work suffered from several limitations: their modified solvers require GPU acceleration; further ablations showed that they were no better than a random baseline on the SATCOMP 2018 benchmark; and their training target of unsat cores required an expensive data pipeline which only labels relatively easy unsatisfiable problems. We address all these limitations, using a simpler network architecture that allows CPU inference even for large industrial problems with millions of clauses, and training instead to predict {\em glue variables}---a target for which it is easier to generate labelled data, and which can also be formulated as a reinforcement learning task. We demonstrate the effectiveness of our approach by modifying the state-of-the-art SAT solver CaDiCaL, improving its performance on SATCOMP 2018 and SATRACE 2019 with supervised learning, and improving its performance on a dataset of SHA-1 preimage attacks with reinforcement learning.
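The periodic-refocusing idea described above can be sketched as a small helper; this is an illustrative reconstruction, not the authors' code, and the names `refocus`, `activities`, and `var_scores` are hypothetical. Every so many conflicts, a model scores all variables once, and the solver bumps the activities of the highest-scoring ones so its existing score-based heuristic drifts toward them:

```python
def refocus(activities, var_scores, top_k=10, bump=1.0):
    """Bump the activities of the top_k variables the model scored highest.

    activities: dict mapping variable -> current heuristic activity score.
    var_scores: dict mapping variable -> model-predicted importance.
    The solver's usual decay/bump machinery continues between refocusings.
    """
    ranked = sorted(var_scores, key=var_scores.get, reverse=True)
    for v in ranked[:top_k]:
        activities[v] = activities.get(v, 0.0) + bump
    return activities
```

The point of this scheme is that one (possibly slow, CPU-only) network call is amortized over thousands of branching decisions made by the cheap score-based heuristic.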

Ozolins, Emils, Freivalds, Karlis, Draguns, Andis, Gaile, Eliza, Zakovskis, Ronalds, Kozlovics, Sergejs

Modern neural networks obtain information about the problem and calculate the output solely from the input values. We argue that this is not always optimal, and that the network's performance can be significantly improved by augmenting it with a query mechanism that allows the network to make several solution trials at run time and obtain feedback on the loss value of each trial. To demonstrate the capabilities of the query mechanism, we formulate an unsupervised (not dependent on labels) loss function for the Boolean Satisfiability Problem (SAT) and show theoretically that it allows the network to extract rich information about the problem. We then propose a neural SAT solver with a query mechanism, called QuerySAT, and show that it outperforms the neural baseline on a wide range of SAT tasks and the classical baselines on the SHA-1 preimage attack and 3-SAT tasks.
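An unsupervised SAT loss of the kind the abstract mentions can be illustrated with a small sketch. The relaxation below, where each clause's loss is the probability that all of its literals are false under soft variable values, is a common differentiable encoding and is not necessarily QuerySAT's exact formulation:

```python
def clause_losses(assignment, clauses):
    """Smoothed per-clause unsatisfiability for soft values in [0, 1].

    assignment: dict mapping variable index -> soft truth value in [0, 1].
    clauses: each clause is a list of non-zero ints (DIMACS style):
        v means variable v appears positively, -v means negated.
    A clause's loss is the product of each literal's probability of being
    false; 0.0 means the clause is certainly satisfied.
    """
    losses = []
    for clause in clauses:
        p_unsat = 1.0
        for lit in clause:
            x = assignment[abs(lit)]
            p_unsat *= (1.0 - x) if lit > 0 else x
        losses.append(p_unsat)
    return losses
```

Because such a loss needs only the formula itself, no labelled solutions, it can serve as the run-time feedback signal for each query the network makes.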

Liang, Jia Hui (University of Waterloo) | Ganesh, Vijay (University of Waterloo) | Poupart, Pascal (University of Waterloo) | Czarnecki, Krzysztof (University of Waterloo)

Modern conflict-driven clause-learning SAT solvers routinely solve large real-world instances with millions of clauses and variables. Their success crucially depends on effective branching heuristics. In this paper, we propose a new branching heuristic inspired by the exponential recency weighted average algorithm used to solve the bandit problem. The branching heuristic, which we call CHB, learns online which variables to branch on by leveraging the feedback received from conflict analysis. We evaluated CHB on 1200 instances from the SAT Competitions of 2013 and 2014, and showed that CHB solves significantly more instances than VSIDS, currently the most effective branching heuristic in widespread use. More precisely, we implemented CHB as part of the MiniSat and Glucose solvers, and performed an apples-to-apples comparison with their VSIDS-based variants. CHB-based MiniSat (resp. CHB-based Glucose) solved approximately 16.1% (resp. 5.6%) more instances than their VSIDS-based variants. Additionally, CHB-based solvers are much more efficient than their VSIDS-based counterparts at constructing first preimage attacks on step-reduced SHA-1 and MD5 cryptographic hash functions. To the best of our knowledge, CHB is the first branching heuristic to solve significantly more instances than VSIDS on a large, diverse benchmark of real-world instances.
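The exponential recency weighted average update at the heart of CHB can be sketched roughly as follows. The parameter names and reward shape paraphrase the paper's description (variables appearing in a conflict are rewarded more the more recently they last participated in one) and should be read as illustrative rather than a faithful reimplementation:

```python
def chb_update(q, last_conflict, conflicts, played_vars, multiplier, alpha):
    """One ERWA-style score update in the spirit of CHB.

    q: dict mapping variable -> current branching score.
    last_conflict: dict mapping variable -> conflict count at its last
        involvement in a conflict (0 if never seen).
    conflicts: total number of conflicts so far.
    played_vars: variables involved in the current conflict analysis.
    alpha: step size (in CHB it decays over time toward a small floor).
    """
    for v in played_vars:
        # Recency-weighted reward: recently active variables earn more.
        reward = multiplier / (conflicts - last_conflict.get(v, 0) + 1)
        q[v] = (1.0 - alpha) * q.get(v, 0.0) + alpha * reward
        last_conflict[v] = conflicts
    return q
```

Branching then simply picks the unassigned variable with the highest `q` score, exactly as VSIDS picks by activity, so CHB slots into existing solvers with minimal changes.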

Kurin, Vitaly, Godil, Saad, Whiteson, Shimon, Catanzaro, Bryan

We present GQSAT, a branching heuristic in a Boolean SAT solver trained with value-based reinforcement learning (RL) using Graph Neural Networks for function approximation. Solvers using GQSAT are complete SAT solvers that provide either a satisfying assignment or a proof of unsatisfiability, which is required for many SAT applications. The branching heuristics commonly used in SAT solvers today suffer from bad decisions during their warm-up period, whereas GQSAT has been trained to examine the structure of the particular problem instance to make better decisions at the beginning of the search. Training GQSAT is data efficient and does not require elaborate dataset preparation or feature engineering. We train GQSAT on small SAT problems using RL, interfacing with an existing SAT solver. We show that GQSAT is able to reduce the number of iterations required to solve SAT problems by 2-3X, and that it generalizes to unsatisfiable SAT instances as well as to problems with 5X more variables than it was trained on. We also show that, to a lesser extent, it generalizes to SAT problems from different domains by evaluating it on graph coloring. Our experiments show that augmenting SAT solvers with agents trained with RL and graph neural networks can improve performance on the SAT search problem.

Boolean satisfiability (SAT) is an important problem for both industry and academia, impacting various fields including circuit design, computer security, artificial intelligence, automatic theorem proving, and combinatorial optimization. As a result, modern SAT solvers are well-crafted, sophisticated, reliable pieces of software that can scale to problems with hundreds of thousands of variables (Ohrimenko et al., 2009). SAT is known to be NP-complete (Karp, 1972), and most state-of-the-art open-source and commercial solvers rely on multiple heuristics to speed up the exhaustive search, which is otherwise intractable. These heuristics are usually meticulously crafted using expert domain knowledge and are often iterated on using trial and error. In this paper, we investigate how we can use machine learning to improve upon an existing branching heuristic without leveraging domain expertise.
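Value-based branching of the kind GQSAT performs can be sketched schematically. The per-literal Q-values would come from the graph neural network; here they are assumed as inputs, and the function names are illustrative, not the authors' implementation:

```python
def pick_branch_literal(q_values, assigned):
    """Greedy action selection over per-literal Q-values.

    q_values: dict mapping (variable, polarity) -> scalar Q-value, as a
        graph network might predict one value per literal.
    assigned: set of variables already fixed by the solver, which must be
        excluded from the action space.
    Returns the (variable, polarity) pair with the highest Q-value.
    """
    candidates = {a: q for a, q in q_values.items() if a[0] not in assigned}
    return max(candidates, key=candidates.get)
```

At decision time the solver branches on the returned variable with the returned polarity; everything else (propagation, conflict analysis, clause learning) stays untouched, which is why completeness is preserved.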

Lederman, Gil, Rabe, Markus N., Seshia, Sanjit A.

We demonstrate how to learn efficient heuristics for automated reasoning algorithms through deep reinforcement learning. We consider search algorithms for quantified Boolean logics, which can already solve formulas of impressive size, up to hundreds of thousands of variables. The main challenge is to find a representation that lends itself to making predictions in a scalable way. The heuristics learned through our approach significantly improve over the handwritten heuristics for several sets of formulas.