Logic & Formal Reasoning
The use of knowledge in open-ended systems
Devereaux, Abigail, Koppl, Roger
Economists model knowledge use and acquisition as a cause-and-effect calculus associating observations made by a decision-maker about their world with possible underlying causes. Knowledge models are well-established for static contexts, but not for contexts of innovative and unbounded change. We develop a representation of knowledge use and acquisition in open-ended evolutionary systems and demonstrate its primary results, including that observers embedded in open-ended evolutionary systems can agree to disagree and that their ability to theorize about their systems is fundamentally local and constrained to their frame of reference what we call frame relativity. The results of our framework formalize local knowledge use, the many-selves interpretation of reasoning through time, and motivate the emergence of nonlogical modes of reasoning like institutional and aesthetic codes.
How To Discover Short, Shorter, and the Shortest Proofs of Unsatisfiability: A Branch-and-Bound Approach for Resolution Proof Length Minimization
Sidorov, Konstantin, van der Linden, Koos, Correia, Gonรงalo Homem de Almeida, de Weerdt, Mathijs, Demiroviฤ, Emir
Modern software for propositional satisfiability problems gives a powerful automated reasoning toolkit, capable of outputting not only a satisfiable/unsatisfiable signal but also a justification of unsatisfiability in the form of resolution proof (or a more expressive proof), which is commonly used for verification purposes. Empirically, modern SAT solvers produce relatively short proofs, however, there are no inherent guarantees that these proofs cannot be significantly reduced. This paper proposes a novel branch-and-bound algorithm for finding the shortest resolution proofs; to this end, we introduce a layer list representation of proofs that groups clauses by their level of indirection. As we show, this representation breaks all permutational symmetries, thereby improving upon the state-of-the-art symmetry-breaking and informing the design of a novel workflow for proof minimization. In addition to that, we design pruning procedures that reason on proof length lower bound, clause subsumption, and dominance. Our experiments suggest that the proofs from state-of-the-art solvers could be shortened by 30-60% on the instances from SAT Competition 2002 and by 25-50% on small synthetic formulas. When treated as an algorithm for finding the shortest proof, our approach solves twice as many instances as the previous work based on SAT solving and reduces the time to optimality by orders of magnitude for the instances solved by both approaches.
Automated Strategy Invention for Confluence of Term Rewrite Systems
Zhang, Liao, Mitterwallner, Fabian, Jakubuv, Jan, Kaliszyk, Cezary
Term rewriting plays a crucial role in software verification and compiler optimization. With dozens of highly parameterizable techniques developed to prove various system properties, automatic term rewriting tools work in an extensive parameter space. This complexity exceeds human capacity for parameter selection, motivating an investigation into automated strategy invention. In this paper, we focus on confluence, an important property of term rewrite systems, and apply machine learning to develop the first learning-guided automatic confluence prover. Moreover, we randomly generate a large dataset to analyze confluence for term rewrite systems. Our results focus on improving the state-of-the-art automatic confluence prover CSI: When equipped with our invented strategies, it surpasses its human-designed strategies both on the augmented dataset and on the original human-created benchmark dataset Cops, proving/disproving the confluence of several term rewrite systems for which no automated proofs were known before.
Querying Perception Streams with Spatial Regular Expressions
Anderson, Jacob, Fainekos, Georgios, Hoxha, Bardh, Okamoto, Hideki, Prokhorov, Danil
Perception in fields like robotics, manufacturing, and data analysis generates large volumes of temporal and spatial data to effectively capture their environments. However, sorting through this data for specific scenarios is a meticulous and error-prone process, often dependent on the application, and lacks generality and reproducibility. In this work, we introduce SpREs as a novel querying language for pattern matching over perception streams containing spatial and temporal data derived from multi-modal dynamic environments. To highlight the capabilities of SpREs, we developed the STREM tool as both an offline and online pattern matching framework for perception data. We demonstrate the offline capabilities of STREM through a case study on a publicly available AV dataset (Woven Planet Perception) and its online capabilities through a case study integrating STREM in ROS with the CARLA simulator. We also conduct performance benchmark experiments on various SpRE queries. Using our matching framework, we are able to find over 20,000 matches within 296 ms making STREM applicable in runtime monitoring applications.
Bridging the Gap: Representation Spaces in Neuro-Symbolic AI
However, although the cooperation between these two seems natural, the difference in their representation is obviously not negligible. Prof. Henry Kautz proposed a taxonomy of Neuro-Symbolic Systems in the AAAI 2020. In addition, many researchers have conducted relevant reviews of the recent neuro-symbolic AI from different perspectives. As Fig.1 shows, Acharya et al. [1] proposed a new classification method, which classified and discussed the application of existing neuro-symbolic AI by the role of neural and symbolic parts: learning for reasoning, reasoning for Learning, and learning-reasoning. Garcez et al. [73] proposed a taxonomy that includes sequential, nested, cooperative, and compiled neuro-symbolic AI based on the six types introduced by Henry Kautz. In addition, some reviews focus on cross-field integration and applications. For example, Berlot-Attwell [27] reviewed neuro-symbolic VQA (visual question answering) from the perspectives of AGI (artificial general intelligence) desiderata. Marra [128] conducted a comprehensive review on integrating neuro-symbolic and statistical relational artificial intelligence based on seven dimensions.
ION-C: Integration of Overlapping Networks via Constraints
Nair, Praveen, Bhandari, Payal, Abavisani, Mohammadsajad, Plis, Sergey, Danks, David
In many causal learning problems, variables of interest are often not all measured over the same observations, but are instead distributed across multiple datasets with overlapping variables. Tillman et al. (2008) presented the first algorithm for enumerating the minimal equivalence class of ground-truth DAGs consistent with all input graphs by exploiting local independence relations, called ION. In this paper, this problem is formulated as a more computationally efficient answer set programming (ASP) problem, which we call ION-C, and solved with the ASP system clingo. The ION-C algorithm was run on random synthetic graphs with varying sizes, densities, and degrees of overlap between subgraphs, with overlap having the largest impact on runtime, number of solution graphs, and agreement within the output set. To validate ION-C on real-world data, we ran the algorithm on overlapping graphs learned from data from two successive iterations of the European Social Survey (ESS), using a procedure for conducting joint independence tests to prevent inconsistencies in the input.
Formal Theorem Proving by Rewarding LLMs to Decompose Proofs Hierarchically
Dong, Kefan, Mahankali, Arvind, Ma, Tengyu
Mathematical theorem proving is an important testbed for large language models' deep and abstract reasoning capability. This paper focuses on improving LLMs' ability to write proofs in formal languages that permit automated proof verification/evaluation. Most previous results provide human-written lemmas to the theorem prover, which is an arguably oversimplified setting that does not sufficiently test the provers' planning and decomposition capabilities. Instead, we work in a more natural setup where the lemmas that are directly relevant to the theorem are not given to the theorem prover at test time. We design an RL-based training algorithm that encourages the model to decompose a theorem into lemmas, prove the lemmas, and then prove the theorem by using the lemmas. Our reward mechanism is inspired by how mathematicians train themselves: even if a theorem is too challenging to be proved by the current model, a positive reward is still given to the model for any correct and novel lemmas that are proposed and proved in this process. During training, our model proposes and proves lemmas that are not in the training dataset. In fact, these newly-proposed correct lemmas consist of 37.7% of the training replay buffer when we train on the dataset extracted from Archive of Formal Proofs (AFP). The model trained by our RL algorithm outperforms that trained by supervised finetuning, improving the pass rate from 40.8% to 45.5% on AFP test set, and from 36.5% to 39.5% on an out-of-distribution test set.
Learning Rules Explaining Interactive Theorem Proving Tactic Prediction
Zhang, Liao, Cerna, David M., Kaliszyk, Cezary
Formally verifying the correctness of mathematical proofs is more accessible than ever, however, the learning curve remains steep for many of the state-of-the-art interactive theorem provers (ITP). Deriving the most appropriate subsequent proof step, and reasoning about it, given the multitude of possibilities, remains a daunting task for novice users. To improve the situation, several investigations have developed machine learning based guidance for tactic selection. Such approaches struggle to learn non-trivial relationships between the chosen tactic and the structure of the proof state and represent them as symbolic expressions. To address these issues we (i) We represent the problem as an Inductive Logic Programming (ILP) task, (ii) Using the ILP representation we enriched the feature space by encoding additional, computationally expensive properties as background knowledge predicates, (iii) We use this enriched feature space to learn rules explaining when a tactic is applicable to a given proof state, (iv) We use the learned rules to filter the output of an existing tactic selection approach and empirically show improvement over the non-filtering approaches.
LogiCity: Advancing Neuro-Symbolic AI with Abstract Urban Simulation
Li, Bowen, Li, Zhaoyu, Du, Qiwei, Luo, Jinqi, Wang, Wenshan, Xie, Yaqi, Stepputtis, Simon, Wang, Chen, Sycara, Katia P., Ravikumar, Pradeep Kumar, Gray, Alexander G., Si, Xujie, Scherer, Sebastian
Recent years have witnessed the rapid development of Neuro-Symbolic (NeSy) AI systems, which integrate symbolic reasoning into deep neural networks. However, most of the existing benchmarks for NeSy AI fail to provide long-horizon reasoning tasks with complex multi-agent interactions. Furthermore, they are usually constrained by fixed and simplistic logical rules over limited entities, making them far from real-world complexities. To address these crucial gaps, we introduce LogiCity, the first simulator based on customizable first-order logic (FOL) for an urban-like environment with multiple dynamic agents. LogiCity models diverse urban elements using semantic and spatial concepts, such as IsAmbulance(X) and IsClose(X, Y). These concepts are used to define FOL rules that govern the behavior of various agents. Since the concepts and rules are abstractions, they can be universally applied to cities with any agent compositions, facilitating the instantiation of diverse scenarios. Besides, a key feature of LogiCity is its support for user-configurable abstractions, enabling customizable simulation complexities for logical reasoning. To explore various aspects of NeSy AI, LogiCity introduces two tasks, one features long-horizon sequential decision-making, and the other focuses on one-step visual reasoning, varying in difficulty and agent behaviors. Our extensive evaluation reveals the advantage of NeSy frameworks in abstract reasoning. Moreover, we highlight the significant challenges of handling more complex abstractions in long-horizon multi-agent scenarios or under high-dimensional, imbalanced data. With its flexible design, various features, and newly raised challenges, we believe LogiCity represents a pivotal step forward in advancing the next generation of NeSy AI. All the code and data are open-sourced at our website.
Proportionality and Strategyproofness in Multiwinner Elections
Multiwinner voting rules can be used to select a fixed-size committee from a larger set of candidates. We consider approval-based committee rules, which allow voters to approve or disapprove candidates. In this setting, several voting rules such as Proportional Approval Voting (PAV) and Phragm\'en's rules have been shown to produce committees that are proportional, in the sense that they proportionally represent voters' preferences; all of these rules are strategically manipulable by voters. On the other hand, a generalisation of Approval Voting gives a non-proportional but strategyproof voting rule. We show that there is a fundamental tradeoff between these two properties: we prove that no multiwinner voting rule can simultaneously satisfy a weak form of proportionality (a weakening of justified representation) and a weak form of strategyproofness. Our impossibility is obtained using a formulation of the problem in propositional logic and applying SAT solvers; a human-readable version of the computer-generated proof is obtained by extracting a minimal unsatisfiable set (MUS). We also discuss several related axiomatic questions in the domain of committee elections.