 Logic & Formal Reasoning


Comparison of different Unique hard attention transformer models by the formal languages they can recognize

arXiv.org Artificial Intelligence

The goal of this note is to give an overview of the capabilities of different flavors of unique hard attention transformer encoders in terms of the formal languages they are able to recognize. This study is relevant in the context of the rising use of large language models, which typically follow a transformer architecture. While the models we primarily investigate have features very distinct from real-world transformers (we will comment on the distinction later), they can still give valuable insights into the principles underlying transformer capabilities. Roughly speaking, a transformer can be thought of as a function that, given an input of any length, constructs a sequence of the same length. It transforms one sequence into the other.
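To make the "sequence in, sequence of the same length out" view concrete, here is a minimal sketch of a single unique hard attention step. It is illustrative only and is not taken from the note: the function name `unique_hard_attention`, the `score` and `value` callbacks, and the leftmost tie-breaking rule are all assumptions chosen for this example.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
V = TypeVar("V")


def unique_hard_attention(
    xs: Sequence[T],
    score: Callable[[T, T], float],
    value: Callable[[T], V],
) -> List[V]:
    """Hypothetical single attention step with unique hard attention.

    Each position attends to exactly one position: the one with the
    highest score, with ties broken in favour of the leftmost position
    (an assumption made for this sketch).
    """
    out: List[V] = []
    for query in xs:
        best = 0
        for j in range(1, len(xs)):
            # Strict comparison keeps the leftmost maximum on ties.
            if score(query, xs[j]) > score(query, xs[best]):
                best = j
        out.append(value(xs[best]))
    return out


if __name__ == "__main__":
    # Every position attends to the leftmost 'b' in the input.
    tokens = list("abba")
    result = unique_hard_attention(
        tokens,
        score=lambda q, k: 1.0 if k == "b" else 0.0,
        value=lambda k: k,
    )
    print(result)  # a list with one output per input position
```

The output always has one entry per input position, which is the length-preserving behaviour described above; an empty input yields an empty output.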


Towards Generating Controllable and Solvable Geometry Problem by Leveraging Symbolic Deduction Engine

arXiv.org Artificial Intelligence

Generating high-quality geometry problems is both an important and challenging task in education. Compared to math word problems, geometry problems further emphasize multi-modal formats and the translation between informal and formal languages. In this paper, we introduce a novel task for geometry problem generation and propose a new pipeline method: the Symbolic Deduction Engine-based Geometry Problem Generation framework (SDE-GPG). The framework leverages a symbolic deduction engine and contains four main steps: (1) searching a predefined mapping table from knowledge points to extended definitions, (2) sampling extended definitions and performing symbolic deduction, (3) filtering out unqualified problems, and (4) generating textual problems and diagrams. Specifically, our method avoids inherent biases in translating natural language into formal language by designing the mapping table, and controls the knowledge points and difficulty of the generated problems through an elaborate checking function. The resulting formal problems are translated into natural language, and the accompanying diagrams are automatically drawn by rule-based methods. We conduct experiments using real-world combinations of knowledge points from two public datasets. The results demonstrate that SDE-GPG can effectively generate readable, solvable and controllable geometry problems.


Thinking Out of the Box: Hybrid SAT Solving by Unconstrained Continuous Optimization

arXiv.org Artificial Intelligence

The Boolean satisfiability (SAT) problem lies at the core of many applications in combinatorial optimization, software verification, cryptography, and machine learning. While state-of-the-art solvers have demonstrated high efficiency in handling conjunctive normal form (CNF) formulas, numerous applications require non-CNF (hybrid) constraints, such as XOR, cardinality, and Not-All-Equal constraints. Recent work leverages polynomial representations to represent such hybrid constraints, but it relies on box constraints that can limit the use of powerful unconstrained optimizers. In this paper, we propose unconstrained continuous optimization formulations for hybrid SAT solving by penalty terms. We provide theoretical insights into when these penalty terms are necessary and demonstrate empirically that unconstrained optimizers (e.g., Adam) can enhance SAT solving on hybrid benchmarks. Our results highlight the potential of combining continuous optimization and machine-learning-based methods for effective hybrid SAT solving.


Enabling Secure and Ephemeral AI Workloads in Data Mesh Environments

arXiv.org Artificial Intelligence

Many large enterprises that operate highly governed and complex ICT environments have no efficient and effective way to support their Data and AI teams in rapidly spinning up and tearing down self-service data and compute infrastructure, to experiment with new data analytic tools, and deploy data products into operational use. This paper proposes a key piece of the solution to the overall problem, in the form of an on-demand self-service data-platform infrastructure to empower de-centralised data teams to build data products on top of centralised templates, policies and governance. The core innovation is an efficient method to leverage immutable container operating systems and infrastructure-as-code methodologies for creating, from scratch, vendor-neutral and short-lived Kubernetes clusters on-premises and in any cloud environment. Our proposed approach can serve as a repeatable, portable and cost-efficient alternative or complement to commercial Platform-as-a-Service (PaaS) offerings, and this is particularly important in supporting interoperability in complex data mesh environments with a mix of modern and legacy compute infrastructure.


ProofNet++: A Neuro-Symbolic System for Formal Proof Verification with Self-Correction

arXiv.org Artificial Intelligence

Table I presents the quantitative evaluation of ProofNet++ across three distinct datasets. The FPSR (Final Proof Success Rate) metric shows that the system performs best on the mathlib-extract dataset with a 74.9% success rate, followed by miniF2F at 68.4%, and the HOL Light Testbed trailing at 63.5%. Similarly, the PPC (Proof Production Correctness) values align with this trend, indicating higher intermediate proof accuracy on mathlib-extract (88.0%) compared to the other datasets. The EDPT (Edit Distance to Proof Target) metric reveals that mathlib-extract proofs require fewer correction steps (2.4) than miniF2F (3.2) and HOL Light (4.0), suggesting that the system is more efficient in approximating correct proofs in that domain. Latency measurements reflect verifier runtime, with mathlib-extract exhibiting the fastest average verification time (176 ms), whereas HOL Light has the highest latency (214 ms). Lastly, the average proof length varies notably, with HOL Light proofs being the longest (14.3 steps), potentially contributing to its higher latency and lower success metrics. These results indicate that while ProofNet++ demonstrates strong performance on established libraries like mathlib-extract, there is room for improvement on datasets with more complex or longer proofs, such as HOL Light. Enhancements could focus on optimizing proof search strategies and reducing verifier latency, particularly for longer proofs, to improve overall efficiency and success rates.

E. Benchmark Pipeline Overview

Figure 1 illustrates the full evaluation pipeline used to benchmark ProofNet++, from the initial input prompt to the final corrected proof output.


Technical Perspective: When Proofs Meet Programs: An Extension of Dependent Type Theory with Church's Thesis

Communications of the ACM

What is a mathematical proof? It can be described as a sequence of logical steps and calculations that serve as evidence of the correctness of a statement. The steps must follow rules that are accepted as correct by the community. One might think there is a set of universal rules. However, this is far from being the case.


Systems Correctness Practices at Amazon Web Services

Communications of the ACM

Amazon Web Services (AWS) strives to deliver reliable services that customers can trust completely. This requires maintaining the highest standards of security, durability, integrity, and availability, with systems correctness serving as the cornerstone for achieving these priorities. An April 2015 article published in Communications of the ACM, titled "How Amazon Web Services Uses Formal Methods," highlighted the approach for ensuring the correctness of critical services that have since become among the most widely used by AWS customers [21]. Central to this approach was TLA+ [14], a formal specification language developed by Leslie Lamport. Our experience at AWS with TLA+ revealed two significant advantages of applying formal methods in practice.


RLJP: Legal Judgment Prediction via First-Order Logic Rule-enhanced with Large Language Models

arXiv.org Artificial Intelligence

Legal Judgment Prediction (LJP) is a pivotal task in legal AI. Existing semantic-enhanced LJP models integrate judicial precedents and legal knowledge for high performance. However, they neglect legal reasoning logic, a critical component of legal judgments requiring rigorous logical analysis. Although some approaches utilize legal reasoning logic for high-quality predictions, their logic rigidity hinders adaptation to case-specific logical frameworks, particularly in complex cases that are lengthy and detailed. This paper proposes a rule-enhanced legal judgment prediction framework based on first-order logic (FOL) formalism and comparative learning (CL) to develop an adaptive adjustment mechanism for legal judgment logic and further enhance performance in LJP. Inspired by the process of human exam preparation, our method follows a three-stage approach: first, we initialize judgment rules using the FOL formalism to capture complex reasoning logic accurately; next, we propose Confusion-aware Contrastive Learning (CACL) to dynamically optimize the judgment rules through a quiz consisting of confusable cases; finally, we utilize the optimized judgment rules to predict legal judgments. Experimental results on two public datasets show superior performance across all metrics. The code is publicly available at https://anonymous.4open.science/r/RLJP-FDF1.


A Compositional Atlas for Algebraic Circuits

Neural Information Processing Systems

Circuits based on sum-product structure have become a ubiquitous representation to compactly encode knowledge, from Boolean functions to probability distributions. By imposing constraints on the structure of such circuits, certain inference queries become tractable, such as model counting and most probable configuration. Recent works have explored analyzing probabilistic and causal inference queries as compositions of basic operators to derive tractability conditions. In this paper, we take an algebraic perspective for compositional inference, and show that a large class of queries (including marginal MAP, probabilistic answer set programming inference, and causal backdoor adjustment) correspond to a combination of basic operators over semirings: aggregation, product, and elementwise mapping. Using this framework, we uncover simple and general sufficient conditions for tractable composition of these operators, in terms of circuit properties (e.g., marginal determinism, compatibility) and conditions on the elementwise mappings.
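As a rough illustration of what "basic operators over semirings" can mean, the sketch below evaluates one tiny fully factorized circuit (a product of per-variable aggregations) under two different semirings. It is not the paper's framework: the `Semiring` class, the helper names, and the toy weights are all assumptions made for this example, and the elementwise mapping operator is not shown.

```python
import operator
from dataclasses import dataclass
from functools import reduce
from typing import Callable, Sequence


@dataclass(frozen=True)
class Semiring:
    """Hypothetical container for a commutative semiring on floats."""
    add: Callable[[float, float], float]
    mul: Callable[[float, float], float]
    zero: float  # identity for add
    one: float   # identity for mul


# Sum-product semiring: aggregation is ordinary addition.
SUM_PRODUCT = Semiring(operator.add, operator.mul, 0.0, 1.0)
# Max-product semiring: 0.0 is the identity for max only on non-negative weights.
MAX_PRODUCT = Semiring(max, operator.mul, 0.0, 1.0)


def aggregate(sr: Semiring, values: Sequence[float]) -> float:
    """Aggregation operator: fold the semiring's addition over the values."""
    return reduce(sr.add, values, sr.zero)


def product(sr: Semiring, values: Sequence[float]) -> float:
    """Product operator: fold the semiring's multiplication over the values."""
    return reduce(sr.mul, values, sr.one)


def evaluate(sr: Semiring, factors: Sequence[Sequence[float]]) -> float:
    """Evaluate a product of per-variable aggregations.

    factors[i][v] is the non-negative weight of variable i taking value v.
    """
    return product(sr, [aggregate(sr, weights) for weights in factors])


if __name__ == "__main__":
    factors = [[1.0, 3.0], [2.0, 2.0]]      # two binary variables (toy data)
    print(evaluate(SUM_PRODUCT, factors))   # total weight of all assignments
    print(evaluate(MAX_PRODUCT, factors))   # weight of the heaviest assignment
```

The same circuit answers a counting-style query under one semiring and a most-probable-configuration-style query under the other, which is the kind of reuse the algebraic view is meant to capture.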


Logical characterizations of recurrent graph neural networks with reals and floats

Neural Information Processing Systems

In pioneering work from 2019, Barceló and coauthors identified logics that precisely match the expressive power of constant iteration-depth graph neural networks (GNNs) relative to properties definable in first-order logic. In this article, we give exact logical characterizations of recurrent GNNs in two scenarios: (1) in the setting with floating-point numbers and (2) with reals. For floats, the formalism matching recurrent GNNs is a rule-based modal logic with counting, while for reals we use a suitable infinitary modal logic, also with counting. These results give exact matches between logics and GNNs in the recurrent setting without relativising to a background logic in either case, but using some natural assumptions about floating-point arithmetic. Applying our characterizations, we also prove that, relative to graph properties definable in monadic second-order logic (MSO), our infinitary and rule-based logics are equally expressive.