 Logic & Formal Reasoning


FMC: Formalization of Natural Language Mathematical Competition Problems

arXiv.org Artificial Intelligence

Efficient and accurate autoformalization methods, which leverage large-scale datasets of natural language mathematical problems to construct formal language datasets, are key to advancing formal mathematical reasoning. In this paper, we propose an autoformalization pipeline based on large language models with error feedback, achieving a fully automatic and training-free formalization approach. Using this pipeline, we curate an Olympiad-level dataset aligning natural language problems with Lean formalizations. The dataset comprises 3,922 mathematical problems in natural language and 9,787 in Lean, of which 64.46% were assessed as at least above-average quality, making it suitable as a benchmark for automated theorem provers. Additionally, we investigate the formalization and reasoning capabilities of various LLMs and empirically demonstrate that few-shot learning, error feedback, and increasing the number of samples enhance the autoformalization process. Experiments with three automated theorem provers on the FMC dataset also highlight its challenging nature and its value as a benchmark for formal reasoning tasks.
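
The abstract does not spell out the pipeline, so the following is only a minimal sketch of what a training-free formalization loop with error feedback could look like. The `generate` and `check` callables are hypothetical stand-ins for an LLM call and a Lean compiler check; neither name nor signature comes from the paper, and the toy implementations exist only so the sketch runs without external tools.

```python
from typing import Callable, Optional, Tuple

# Hypothetical interfaces (assumptions, not taken from the paper):
#   generate(problem, feedback) -> candidate Lean statement as a string
#   check(candidate)            -> (ok, error_message)
Generate = Callable[[str, Optional[str]], str]
Check = Callable[[str], Tuple[bool, str]]


def formalize_with_feedback(problem: str, generate: Generate, check: Check,
                            max_rounds: int = 3) -> Optional[str]:
    """Retry formalization, passing the last checker error back to the generator."""
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        candidate = generate(problem, feedback)
        ok, error = check(candidate)
        if ok:
            return candidate
        feedback = error  # the next attempt sees why this one failed
    return None  # no candidate passed within the round budget


# Toy stand-ins so the sketch runs without an LLM or a Lean toolchain.
def toy_generate(problem: str, feedback: Optional[str]) -> str:
    if feedback is None:
        return "theorem t : 1 + 1 = 2"  # first attempt lacks a proof body
    return "theorem t : 1 + 1 = 2 := by norm_num"


def toy_check(candidate: str) -> Tuple[bool, str]:
    if ":= by" in candidate:
        return True, ""
    return False, "error: expected ':='"


result = formalize_with_feedback("Show that 1 + 1 = 2.", toy_generate, toy_check)
```

In this toy run the first candidate is rejected, its error message is fed back, and the second candidate is accepted. Sampling several candidates per round, which the abstract also reports as helpful, would wrap the same loop.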


Defining neurosymbolic AI

arXiv.org Artificial Intelligence

Neurosymbolic AI focuses on integrating learning and reasoning, in particular on unifying logical and neural representations. Despite the existence of an alphabet soup of neurosymbolic AI systems, the field lacks a generally accepted formal definition of what neurosymbolic models and inference really are. We introduce a formal definition of neurosymbolic AI that abstracts its key ingredients. More specifically, we define neurosymbolic inference as the computation of an integral over a product of a logical function and a belief function. We show that our definition abstracts key representative neurosymbolic AI systems.
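
As an illustration only, the integral over a product of a logical and a belief function can be sketched for a finite set of Boolean worlds, where it reduces to a sum. Treating the logical function as a 0/1 indicator and the belief function as a probability is an assumption made here for the discrete case (it coincides with weighted model counting); the paper's definition is more general, and the function names below are hypothetical.

```python
from itertools import product
from typing import Callable, Tuple

World = Tuple[bool, ...]


def neurosymbolic_inference(logic: Callable[[World], float],
                            belief: Callable[[World], float],
                            num_vars: int) -> float:
    """Discrete analogue of the integral: sum of logic(w) * belief(w) over all worlds."""
    worlds = product([False, True], repeat=num_vars)
    return sum(logic(w) * belief(w) for w in worlds)


# Assumed example: two variables (rain, sprinkler) with independent beliefs.
# In a neurosymbolic system these probabilities would come from a neural model;
# here they are fixed numbers chosen for illustration.
P_RAIN, P_SPRINKLER = 0.2, 0.5


def belief(world: World) -> float:
    rain, sprinkler = world
    p_r = P_RAIN if rain else 1.0 - P_RAIN
    p_s = P_SPRINKLER if sprinkler else 1.0 - P_SPRINKLER
    return p_r * p_s


def logic(world: World) -> float:
    rain, sprinkler = world
    return 1.0 if (rain or sprinkler) else 0.0  # formula: rain OR sprinkler


p_wet = neurosymbolic_inference(logic, belief, num_vars=2)
```

Here `p_wet` is the belief mass of the worlds satisfying the formula, about 0.6 for these numbers.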


A Fuzzy Approach to the Specification, Verification and Validation of Risk-Based Ethical Decision Making Models

arXiv.org Artificial Intelligence

The ontological and epistemic complexities inherent in the moral domain make it challenging to establish clear standards for evaluating the performance of a moral machine. In this paper, we present a formal method to describe Ethical Decision Making models based on ethical risk assessment. Then, we show how these models that are specified as fuzzy rules can be verified and validated using fuzzy Petri nets. A case study from the medical field is considered to illustrate the proposed approach.
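
The abstract does not give the rule format, so the sketch below only shows how a single fuzzy rule of the kind "IF severity is high AND likelihood is high THEN ethical risk is high" is commonly evaluated, using triangular membership functions and the minimum as the AND operator. The variable names, membership shapes, and input values are all assumptions for illustration, and no Petri net machinery is modeled.

```python
def triangular(x: float, left: float, peak: float, right: float) -> float:
    """Triangular membership degree of x; requires left < peak < right."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def rule_strength(severity: float, likelihood: float) -> float:
    """Firing strength of: IF severity is high AND likelihood is high THEN risk is high."""
    severity_high = triangular(severity, 0.5, 1.0, 1.5)      # assumed shape
    likelihood_high = triangular(likelihood, 0.4, 0.9, 1.4)  # assumed shape
    return min(severity_high, likelihood_high)  # min acts as fuzzy AND


# Hypothetical inputs on a 0..1 scale.
strength = rule_strength(severity=0.8, likelihood=0.9)
```

With these inputs the severity term has degree about 0.6 and the likelihood term has degree 1.0, so the rule fires with strength about 0.6.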


Extending Defeasibility for Propositional Standpoint Logics

arXiv.org Artificial Intelligence

In this paper, we introduce a new defeasible version of propositional standpoint logic by integrating Kraus et al.'s defeasible conditionals, Britz and Varzinczak's notions of defeasible necessity and distinct possibility, along with Leisegang et al.'s approach to defeasibility into the standpoint logics of Gómez Álvarez and Rudolph. The resulting logical framework allows for the expression of defeasibility on the level of implications, standpoint modal operators, and standpoint-sharpening statements. We provide a preferential semantics for this extended language and propose a tableaux calculus, which is shown to be sound and complete with respect to preferential entailment. We also establish the computational complexity of the tableaux procedure to be in PSpace.


Why this and not that? A Logic-based Framework for Contrastive Explanations

arXiv.org Artificial Intelligence

We define several canonical problems related to contrastive explanations, each answering a question of the form "Why P but not Q?". The problems compute causes for both P and Q, explicitly comparing their differences. We investigate the basic properties of our definitions in the setting of propositional logic. We show, inter alia, that our framework captures a cardinality-minimal version of existing contrastive explanations in the literature. Furthermore, we provide an extensive analysis of the computational complexities of the problems. We also implement the problems for CNF-formulas using answer set programming and present several examples demonstrating how they work in practice.


The end of radical concept nativism

arXiv.org Artificial Intelligence

Though humans seem to be remarkable learners, arguments in cognitive science and philosophy of mind have long maintained that learning something fundamentally new is impossible. Specifically, Jerry Fodor's arguments for radical concept nativism hold that most, if not all, concepts are innate and that what many call concept learning never actually leads to the acquisition of new concepts. These arguments have deeply affected cognitive science, and many believe that the counterarguments to radical concept nativism have been either unsuccessful or only apply to a narrow class of concepts. This paper first reviews the features and limitations of prior arguments. We then identify three critical points (related to issues of expressive power, conceptual structure, and concept possession) at which the arguments in favor of radical concept nativism diverge from describing actual human cognition. We use ideas from computer science and information theory to formalize the relevant ideas in ways that are arguably more scientifically productive. We conclude that, as a result, there is an important sense in which people do indeed learn new concepts.


Towards Solving More Challenging IMO Problems via Decoupled Reasoning and Proving

arXiv.org Artificial Intelligence

Automated Theorem Proving (ATP) in formal languages is a foundational challenge for AI. While Large Language Models (LLMs) have driven remarkable progress, a significant gap remains between their powerful informal reasoning capabilities and their weak formal proving performance. Recent studies show that the informal accuracy exceeds 80% while formal success remains below 8% on benchmarks like PutnamBench. We argue this gap persists because current state-of-the-art provers, by tightly coupling reasoning and proving, are trained with paradigms that inadvertently punish deep reasoning in favor of shallow, tactic-based strategies. To bridge this fundamental gap, we propose a novel framework that decouples high-level reasoning from low-level proof generation. Our approach utilizes two distinct, specialized models: a powerful, general-purpose Reasoner to generate diverse, strategic subgoal lemmas, and an efficient Prover to rigorously verify them. This modular design liberates the model's full reasoning potential and bypasses the pitfalls of end-to-end training. We evaluate our method on a challenging set of post-2000 IMO problems, a problem set on which no prior open-source prover has reported success. Our decoupled framework successfully solves 5 of these problems, demonstrating a significant step towards automated reasoning on exceptionally difficult mathematical challenges. To foster future research, we release our full dataset of generated and verified lemmas for a wide range of IMO problems, available at https://tencent-imo.github.io/.


CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization

arXiv.org Artificial Intelligence

Translating natural language mathematical statements into formal, executable code is a fundamental challenge in automated theorem proving. While prior work has focused on generation and compilation success, little attention has been paid to the critic phase, that is, the evaluation of whether generated formalizations truly capture the semantic intent of the original problem. In this paper, we introduce CriticLean, a novel critic-guided reinforcement learning framework that elevates the role of the critic from a passive validator to an active learning component. Specifically, we first propose CriticLeanGPT, trained via supervised fine-tuning and reinforcement learning, to rigorously assess the semantic fidelity of Lean 4 formalizations. Then, we introduce CriticLeanBench, a benchmark designed to measure models' ability to distinguish semantically correct from incorrect formalizations, and demonstrate that our trained CriticLeanGPT models can significantly outperform strong open- and closed-source baselines. Building on the CriticLean framework, we construct FineLeanCorpus, a dataset comprising over 285K problems that exhibits rich domain diversity, broad difficulty coverage, and high correctness based on human evaluation. Overall, our findings highlight that optimizing the critic phase is essential for producing reliable formalizations, and we hope CriticLean will provide valuable insights for future advances in formal mathematical reasoning.


Current Practices for Building LLM-Powered Reasoning Tools Are Ad Hoc -- and We Can Do Better

arXiv.org Artificial Intelligence

There is growing excitement about building software verifiers, synthesizers, and other Automated Reasoning (AR) tools by combining traditional symbolic algorithms and Large Language Models (LLMs). Unfortunately, the current practice for constructing such neurosymbolic AR systems is an ad hoc programming model that does not have the strong guarantees of traditional symbolic algorithms, nor a deep enough synchronization of neural networks and symbolic reasoning to unlock the full potential of LLM-powered reasoning. I propose Neurosymbolic Transition Systems as a principled computational model that can underlie infrastructure for building neurosymbolic AR tools. In this model, symbolic state is paired with intuition, and state transitions operate over symbols and intuition in parallel. I argue why this new paradigm can scale logical reasoning beyond current capabilities while retaining the strong guarantees of symbolic algorithms, and I sketch out how the computational model I propose can be reified in a logic programming language.


Automated Reasoning for Vulnerability Management by Design

arXiv.org Artificial Intelligence

To secure systems, it is essential to manage their vulnerability posture and design appropriate security controls. Vulnerability management makes it possible to address vulnerabilities proactively by incorporating pertinent security controls into system designs. Current vulnerability management approaches do not support systematic reasoning about the vulnerability postures of system designs. To effectively manage vulnerabilities and design security controls, we propose a formally grounded automated reasoning mechanism. We integrate the mechanism into an open-source security design tool and demonstrate its application through an illustrative example driven by real-world challenges. The automated reasoning mechanism allows system designers to identify vulnerabilities that are applicable to a specific system design, explicitly specify vulnerability mitigation options, declare selected controls, and thus systematically manage vulnerability postures.