AITopics | Logic & Formal Reasoning

Logic & Formal Reasoning

"I think the best hope for human-level AI is logical AI, based on the formalizing of commonsense knowledge and reasoning in mathematical logic. Formalizing common sense requires extensions to mathematical logic including nonmonotonic reasoning and extensive reification, e.g., of concepts and also contexts. The reifications require appropriate reflection schemas."
– from The Future of AI—A Manifesto by John McCarthy. AI Magazine 26(4), (2005).

Compiling Metric Temporal Answer Set Programming

Becker, Arvid, Cabalar, Pedro, Diéguez, Martin, Romero, Javier, Hahn, Susana, Schaub, Torsten

arXiv.org Artificial Intelligence, Jun-11-2025

We develop a computational approach to Metric Answer Set Programming (ASP) to allow for expressing quantitative temporal constraints, like durations and deadlines. A central challenge is to maintain scalability when dealing with fine-grained timing constraints, which can significantly exacerbate ASP's grounding bottleneck. To address this issue, we leverage extensions of ASP with difference constraints, a simplified form of linear constraints, to handle time-related aspects externally. Our approach effectively decouples metric ASP from the granularity of time, resulting in a solution that is unaffected by time precision.
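To make the notion of a difference constraint concrete, here is a minimal Python sketch. It is not the authors' system: it only illustrates the textbook fact that a set of constraints of the form x - y <= k can be solved by Bellman-Ford relaxation on a constraint graph. The function name and the `start`/`end` variables are hypothetical, chosen for illustration.

```python
from typing import Dict, List, Optional, Tuple

# A difference constraint (x, y, k) encodes x - y <= k.
Constraint = Tuple[str, str, int]


def solve_difference_constraints(
    constraints: List[Constraint],
) -> Optional[Dict[str, int]]:
    """Return one satisfying assignment, or None if the constraints conflict.

    Illustrative sketch only (Bellman-Ford on the constraint graph).
    """
    variables = sorted({v for x, y, _ in constraints for v in (x, y)})
    # Starting every distance at 0 plays the role of a virtual source
    # connected to all variables by zero-weight edges.
    dist = {v: 0 for v in variables}
    # x - y <= k is treated as an edge y -> x with weight k.
    for _ in range(len(variables) + 1):
        changed = False
        for x, y, k in constraints:
            if dist[y] + k < dist[x]:
                dist[x] = dist[y] + k
                changed = True
        if not changed:
            return dist
    # Still relaxing after |V| + 1 passes: a negative cycle exists,
    # so the constraints cannot all hold.
    return None


# A duration between 3 and 5 time units: end - start <= 5, start - end <= -3.
print(solve_difference_constraints([("end", "start", 5), ("start", "end", -3)]))
```

Rescaling the bounds from, say, seconds to milliseconds changes only the integer weights, not the number of variables or constraints, which gives some intuition for why handling time with difference constraints need not depend on time granularity.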

artificial intelligence, constraint, logic & formal reasoning, (17 more...)

2506.0815

Country: Europe (0.46)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)

Premise Selection for a Lean Hammer

Zhu, Thomas, Clune, Joshua, Avigad, Jeremy, Jiang, Albert Qiaochu, Welleck, Sean

arXiv.org Artificial Intelligence, Jun-10-2025

Neural methods are transforming automated reasoning for proof assistants, yet integrating these advances into practical verification workflows remains challenging. Hammers are tools that interface with external automatic theorem provers to automate tedious reasoning steps. They have dramatically improved productivity in proof assistants, but the Lean proof assistant still does not have a hammer despite its growing popularity. We present LeanHammer, the first end-to-end domain-general hammer for Lean, built on a novel neural premise selection system for a hammer in dependent type theory. Unlike existing Lean premise selectors, our approach dynamically adapts to user-specific contexts and combines with symbolic proof search and reconstruction to create a practical hammer. With comprehensive evaluations, we show that our premise selector enables LeanHammer to solve 21% more goals relative to existing premise selectors, and generalize well to diverse domains. Our work bridges the gap between neural retrieval and symbolic reasoning, making formal verification more accessible to researchers and practitioners.

artificial intelligence, logic & formal reasoning, theorem, (16 more...)

2506.07477

Country:

Europe (0.93)
North America > United States > California (0.28)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)

From Axioms to Algorithms: Mechanized Proofs of the vNM Utility Theorem

Jingyuan, Li

arXiv.org Artificial Intelligence, Jun-10-2025

This paper presents a comprehensive formalization of the von Neumann-Morgenstern (vNM) expected utility theorem using the Lean 4 interactive theorem prover. We implement the classical axioms of preference (completeness, transitivity, continuity, and independence), enabling machine-verified proofs of both the existence and uniqueness of utility representations. Our formalization captures the mathematical structure of preference relations over lotteries, verifying that preferences satisfying the vNM axioms can be represented by expected utility maximization. Our contributions include a granular implementation of the independence axiom, formally verified proofs of fundamental claims about mixture lotteries, constructive demonstrations of utility existence, and computational experiments validating the results. We prove equivalence to classical presentations while offering greater precision at decision boundaries. This formalization provides a rigorous foundation for applications in economic modeling, AI alignment, and management decision systems, bridging the gap between theoretical decision theory and computational implementation.
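As a small numerical companion to the abstract (not the paper's Lean 4 development), the Python sketch below represents lotteries as finite probability tables, forms mixture lotteries, and checks one instance of the independence axiom for a preference induced by expected utility. The outcome names, utility values, and function names are hypothetical.

```python
from fractions import Fraction
from typing import Dict

# A lottery maps each outcome to its probability; probabilities sum to 1.
Lottery = Dict[str, Fraction]


def mix(alpha: Fraction, p: Lottery, q: Lottery) -> Lottery:
    """Return the mixture lottery alpha*p + (1 - alpha)*q."""
    outcomes = set(p) | set(q)
    zero = Fraction(0)
    return {o: alpha * p.get(o, zero) + (1 - alpha) * q.get(o, zero) for o in outcomes}


def expected_utility(p: Lottery, u: Dict[str, Fraction]) -> Fraction:
    """Expected utility of lottery p under the utility table u."""
    return sum((prob * u[o] for o, prob in p.items()), Fraction(0))


def prefers(p: Lottery, q: Lottery, u: Dict[str, Fraction]) -> bool:
    """Strict preference induced by expected utility."""
    return expected_utility(p, u) > expected_utility(q, u)


# Hypothetical utilities and lotteries, for illustration only.
u = {"low": Fraction(0), "mid": Fraction(6), "high": Fraction(10)}
p = {"mid": Fraction(1)}
q = {"low": Fraction(1, 2), "high": Fraction(1, 2)}
r = {"low": Fraction(1)}
alpha = Fraction(1, 3)

# Independence: if p is preferred to q, mixing both with r preserves that.
print(prefers(p, q, u), prefers(mix(alpha, p, r), mix(alpha, q, r), u))
```

Exact rational arithmetic via `fractions.Fraction` avoids floating-point noise in comparisons where two lotteries have nearly equal expected utility.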

decision support system, logic & formal reasoning, machine learning, (20 more...)

2506.07066

Genre: Research Report (0.49)

Industry:

Information Technology (0.46)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Decision Support Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

A dependently-typed calculus of event telicity and culminativity

Kovalev, Pavel, Angiuli, Carlo

arXiv.org Artificial Intelligence, Jun-10-2025

We present a dependently-typed cross-linguistic framework for analyzing the telicity and culminativity of events, accompanied by examples of using our framework to model English sentences. Our framework consists of two parts. In the nominal domain, we model the boundedness of noun phrases and its relationship to subtyping, delimited quantities, and adjectival modification. In the verbal domain we define a dependent event calculus, modeling telic events as those whose undergoer is bounded, culminating events as telic events that achieve their inherent endpoint, and consider adverbial modification. In both domains we pay particular attention to associated entailments. Our framework is defined as an extension of intensional Martin-Löf dependent type theory, and the rules and examples in this paper have been formalized in the Agda proof assistant.

logic & formal reasoning, natural language, undergoer, (19 more...)

2506.06968

Country:

North America > United States (0.67)
Europe > Netherlands (0.46)
Europe > United Kingdom > England (0.14)

Genre: Instructional Material (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

MATP-BENCH: Can MLLM Be a Good Automated Theorem Prover for Multimodal Problems?

He, Zhitao, Lyu, Zongwei, Chen, Dazhong, Guo, Dadi, Fung, Yi R.

arXiv.org Artificial Intelligence, Jun-9-2025

Numerous theorems, such as those in geometry, are often presented in multimodal forms (e.g., diagrams). Humans benefit from visual reasoning in such settings, using diagrams to gain intuition and guide the proof process. Modern Multimodal Large Language Models (MLLMs) have demonstrated remarkable capabilities in solving a wide range of mathematical problems. However, the potential of MLLMs as Automated Theorem Provers (ATPs), specifically in the multimodal domain, remains underexplored. In this paper, we introduce the Multimodal Automated Theorem Proving benchmark (MATP-BENCH), a new Multimodal, Multi-level, and Multi-language benchmark designed to evaluate MLLMs in this role as multimodal automated theorem provers. MATP-BENCH consists of 1056 multimodal theorems drawn from high school, university, and competition-level mathematics. All these multimodal problems are accompanied by formalizations in Lean 4, Coq and Isabelle, thus making the benchmark compatible with a wide range of theorem-proving frameworks. MATP-BENCH requires models to integrate sophisticated visual understanding with mastery of a broad spectrum of mathematical knowledge and rigorous symbolic reasoning to generate formal proofs. We use MATP-BENCH to evaluate a variety of advanced multimodal language models. Existing methods can only solve a limited number of the MATP-BENCH problems, indicating that this benchmark poses an open challenge for research on automated theorem proving.

large language model, logic & formal reasoning, machine learning, (19 more...)

2506.06034

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry: Education > Educational Setting (0.90)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.97)

Fuzzy Lattice-based Description Logic

Ding, Yiwen, Manoorkar, Krishna

arXiv.org Artificial Intelligence, Jun-9-2025

Recently, the description logic LE-ALC was introduced for reasoning in the semantic environment of enriched formal contexts, and a polynomial-time tableaux algorithm was developed to check the consistency of knowledge bases with acyclic TBoxes. In this work, we introduce a fuzzy generalization of LE-ALC called LE-FALC, which provides a description logic counterpart of many-valued normal non-distributive logic (a.k.a. many-valued LE-logic). This description logic can be used to represent and reason about knowledge in the formal framework of fuzzy formal contexts and fuzzy formal concepts. We give a tableaux algorithm that provides a complete and sound polynomial-time decision procedure to check the consistency of LE-FALC ABoxes. As a result, we also obtain an exponential-time decision procedure for checking the consistency of LE-FALC with acyclic TBoxes by unraveling.

artificial intelligence, logic & formal reasoning, resp, (15 more...)

doi: 10.4204/EPTCS.421.3

2506.05833

Country:

Europe > Netherlands (0.28)
North America > Mexico (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Description Logic (1.00)

Non-Asymptotic Length Generalization

Chen, Thomas, Ma, Tengyu, Li, Zhiyuan

arXiv.org Artificial Intelligence, Jun-9-2025

Length generalization is the ability of a learning algorithm to learn a hypothesis which generalizes to longer inputs than the inputs in the training set. In this paper, we provide provable guarantees of length generalization for various classes of functions in an idealized setting. First, we formalize the framework of non-asymptotic length generalization, which requires a computable upper bound for the minimum input length that guarantees length generalization, as a function of the complexity of the ground-truth function under some given complexity measure. We refer to this minimum input length needed to length generalize as length complexity. We show the Minimum-Complexity Interpolator learning algorithm achieves optimal length complexity. We further show that whether a function class admits non-asymptotic length generalization is equivalent to the decidability of its language equivalence problem, which implies that there is no computable upper bound for the length complexity of Context-Free Grammars. On the positive side, we show that the length complexity of Deterministic Finite Automata is $2n - 2$ where $n$ is the number of states of the ground-truth automaton. Our main results are upper bounds of length complexity for a subset of a transformer-related function class called C-RASP (Yang & Chiang, 2024). We show that the length complexity of 1-layer C-RASP functions is $O(T^2)$ when the ground-truth function has precision $T$, and that the length complexity of 2-layer C-RASP functions is $O(T^{O(K)})$ when the ground-truth function has precision $T$ and $K$ heads.
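For intuition about the $2n - 2$ figure for Deterministic Finite Automata, the following Python sketch finds a shortest string on which two DFAs disagree by breadth-first search over their product automaton. It is an illustration under stated assumptions, not the paper's construction: the two example automata (each with $n = 2$ states) and all names are hypothetical, and how this search relates to the paper's proof is not stated in the abstract.

```python
from collections import deque
from typing import Dict, Optional, Set, Tuple

# A DFA is (start state, accepting states, transition table).
DFA = Tuple[int, Set[int], Dict[Tuple[int, str], int]]


def shortest_distinguishing_string(a: DFA, b: DFA, alphabet: str) -> Optional[str]:
    """Return a shortest string accepted by exactly one DFA, or None if equivalent."""
    (start_a, accept_a, delta_a), (start_b, accept_b, delta_b) = a, b
    queue = deque([(start_a, start_b, "")])
    seen = {(start_a, start_b)}
    while queue:
        qa, qb, word = queue.popleft()
        if (qa in accept_a) != (qb in accept_b):
            return word  # BFS order guarantees minimal length
        for symbol in alphabet:
            nxt = (delta_a[(qa, symbol)], delta_b[(qb, symbol)])
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt[0], nxt[1], word + symbol))
    return None


# Hypothetical 2-state DFAs over {a, b}.
# even_as: accepts strings containing an even number of 'a'.
even_as: DFA = (0, {0}, {(0, "a"): 1, (1, "a"): 0, (0, "b"): 0, (1, "b"): 1})
# not_ending_a: accepts strings that do not end in 'a' (including the empty string).
not_ending_a: DFA = (0, {0}, {(0, "a"): 1, (1, "a"): 1, (0, "b"): 0, (1, "b"): 0})

witness = shortest_distinguishing_string(even_as, not_ending_a, "ab")
print(witness, len(witness))
```

In this toy pair the two automata agree on every string of length 0 and 1 and first disagree at length 2, which equals $2n - 2$ for $n = 2$.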

length generalization, logic & formal reasoning, machine learning, (20 more...)

2506.03085

Country: North America > United States (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.45)

Trustworthiness Preservation by Copies of Machine Learning Systems

Ceragioli, Leonardo, Primiero, Giuseppe

arXiv.org Artificial Intelligence, Jun-6-2025

A common practice of ML systems development concerns the training of the same model under different data sets, and the use of the same (training and test) sets for different learning models. The first case is a desirable practice for identifying high quality and unbiased training conditions. The latter case coincides with the search for optimal models under a common dataset for training. These differently obtained systems have been considered akin to copies. In the quest for responsible AI, a legitimate but hardly investigated question is how to verify that trustworthiness is preserved by copies. In this paper we introduce a calculus to model and verify probabilistic complex queries over data and define four distinct notions: Justifiably, Equally, Weakly and Almost Trustworthy which can be checked analysing the (partial) behaviour of the copy with respect to its original. We provide a study of the relations between these notions of trustworthiness, and how they compose with each other and under logical operations. The aim is to offer a computational tool to check the trustworthiness of possibly complex systems copied from an original whose behaviour is known.

artificial intelligence, logic & formal reasoning, machine learning, (20 more...)

2506.05203

Country: Europe (0.67)

Genre: Research Report (0.63)

Industry:

Health & Medicine (0.73)
Banking & Finance (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)

Comparison of different Unique hard attention transformer models by the formal languages they can recognize

Ryvkin, Leonid

arXiv.org Artificial Intelligence, Jun-5-2025

The goal of this note is to give an overview of the capabilities of different flavors of unique hard attention transformer encoders in terms of the formal languages they are able to recognize. This study is relevant in the context of the rising use of large language models, which typically follow a transformer architecture. While the model we will be primarily investigating has features very distinct from real-world transformers (we will comment on the distinction later), it can still give valuable insights into the principles underlying transformer capabilities. Roughly speaking, a transformer can be thought of as a function that, given an input of any length, constructs a sequence of the same length. It transforms one sequence into the other.
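The "sequence in, same-length sequence out" view can be sketched in a few lines of Python. This is a toy illustration, not the note's formal model: each position attends to exactly one position with the highest score, and the leftmost/rightmost tie-breaking flag is an assumption about how "flavors" of unique hard attention might differ. All function and parameter names are hypothetical.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def unique_hard_attention(
    xs: Sequence[T],
    score: Callable[[T, T], float],
    combine: Callable[[T, T], U],
    leftmost: bool = True,
) -> List[U]:
    """Toy unique-hard-attention layer: one attended position per query."""
    out: List[U] = []
    for query in xs:
        scores = [score(query, key) for key in xs]
        best = max(scores)
        candidates = [j for j, s in enumerate(scores) if s == best]
        # Ties are broken toward one end so that exactly one position is chosen.
        j = candidates[0] if leftmost else candidates[-1]
        out.append(combine(query, xs[j]))
    return out


# Hypothetical input: (position, value) pairs; every query attends to the largest value.
tokens = [(0, 2), (1, 7), (2, 7)]
attend_left = unique_hard_attention(tokens, lambda q, k: k[1], lambda q, k: k[0])
attend_right = unique_hard_attention(tokens, lambda q, k: k[1], lambda q, k: k[0], leftmost=False)
print(attend_left, attend_right)
```

The output always has the same length as the input, and the two tie-breaking choices select different positions when several keys share the maximal score.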

large language model, logic & formal reasoning, machine learning, (21 more...)

2506.0337

Genre: Overview (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.85)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.54)

Towards Generating Controllable and Solvable Geometry Problem by Leveraging Symbolic Deduction Engine

Jiang, Zhuoxuan, Zhang, Tianyang, Peng, Peiyan, Chen, Jing, Xun, Yinong, Zhang, Haotian, Li, Lichi, Li, Yong, Zhang, Shaohua

arXiv.org Artificial Intelligence, Jun-4-2025

Generating high-quality geometry problems is both an important and challenging task in education. Compared to math word problems, geometry problems further emphasize multi-modal formats and the translation between informal and formal languages. In this paper, we introduce a novel task for geometry problem generation and propose a new pipeline method: the Symbolic Deduction Engine-based Geometry Problem Generation framework (SDE-GPG). The framework leverages a symbolic deduction engine and contains four main steps: (1) searching a predefined mapping table from knowledge points to extended definitions, (2) sampling extended definitions and performing symbolic deduction, (3) filtering out unqualified problems, and (4) generating textual problems and diagrams. Specifically, our method helps avoid the inherent biases of translating natural language into formal language by designing the mapping table, and guarantees control over the generated problems in terms of knowledge points and difficulty through an elaborate checking function. The resulting formal problems are translated into natural language, and the accompanying diagrams are drawn automatically by rule-based methods. We conduct experiments using real-world combinations of knowledge points from two public datasets. The results demonstrate that the SDE-GPG can effectively generate readable, solvable and controllable geometry problems.

large language model, logic & formal reasoning, machine learning, (21 more...)

2506.02565

Country: Asia > China (0.30)

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.55)
Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
