AITopics | Logic & Formal Reasoning

Collaborating Authors

Logic & Formal Reasoning

"I think the best hope for human-level AI is logical AI, based on the formalizing of commonsense knowledge and reasoning in mathematical logic. Formalizing common sense requires extensions to mathematical logic including nonmonotonic reasoning and extensive reification, e.g., of concepts and also contexts. The reifications require appropriate reflection schemas."
– from The Future of AI—A Manifesto by John McCarthy. AI Magazine 26(4), (2005).

News Overviews Instructional Materials AI-Alerts Classics

Towards Verified Code Reasoning by LLMs

Sistla, Meghana, Balakrishnan, Gogul, Rondon, Pat, Cambronero, José, Tufano, Michele, Chandra, Satish

arXiv.org Artificial IntelligenceNov-17-2025

While LLM-based agents are able to tackle a wide variety of code reasoning questions, the answers are not always correct. This prevents the agent from being useful in situations where high precision is desired: (1) helping a software engineer understand a new code base, (2) helping a software engineer during code review sessions, and (3) ensuring that the code generated by an automated code generation system meets certain requirements (e.g. fixes a bug, improves readability, implements a feature). As a result of this lack of trustworthiness, the agent's answers need to be manually verified before they can be trusted. Manually confirming responses from a code reasoning agent requires human effort and can result in slower developer productivity, which weakens the assistance benefits of the agent. In this paper, we describe a method to automatically validate the answers provided by a code reasoning agent by verifying its reasoning steps. At a very high level, the method consists of extracting a formal representation of the agent's response and, subsequently, using formal verification and program analysis tools to verify the agent's reasoning steps. We applied this approach to a benchmark set of 20 uninitialized variable errors detected by sanitizers and 20 program equivalence queries. For the uninitialized variable errors, the formal verification step was able to validate the agent's reasoning on 13/20 examples, and for the program equivalence queries, the formal verification step successfully caught 6/8 incorrect judgments made by the agent.

large language model, logic & formal reasoning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2509.26546

Country: North America > United States (0.46)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

GALOIS: Boosting Deep Reinforcement Learning via Generalizable Logic Synthesis

Neural Information Processing SystemsNov-15-2025, 06:22:14 GMT

Logic contains explainable task-solving knowledge that naturally can generalize across similar tasks.

logic & formal reasoning, machine learning, reinforcement learning, (18 more...)

Neural Information Processing Systems

Country:

North America > Canada > Alberta (0.14)
Asia > Singapore (0.04)
Asia > China > Tianjin Province > Tianjin (0.04)
(4 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Ax-Prover: A Deep Reasoning Agentic Framework for Theorem Proving in Mathematics and Quantum Physics

Breen, Benjamin, Del Tredici, Marco, McCarran, Jacob, Mijares, Javier Aspuru, Yin, Weichen Winston, Sulimany, Kfir, Taylor, Jacob M., Koppens, Frank H. L., Englund, Dirk

arXiv.org Artificial IntelligenceNov-14-2025

We present Ax-Prover, a multi-agent system for automated theorem proving in Lean that can solve problems across diverse scientific domains and operate either autonomously or collaboratively with human experts. To achieve this, Ax-Prover approaches scientific problem solving through formal proof generation, a process that demands both creative reasoning and strict syntactic rigor. Ax-Prover meets this challenge by equipping Large Language Models (LLMs), which provide knowledge and reasoning, with Lean tools via the Model Context Protocol (MCP), which ensure formal correctness. To evaluate its performance as an autonomous prover, we benchmark our approach against frontier LLMs and specialized prover models on two public math benchmarks and on two Lean benchmarks we introduce in the fields of abstract algebra and quantum theory. On public datasets, Ax-Prover is competitive with state-of-the-art provers, while it largely outperforms them on the new benchmarks. This shows that, unlike specialized systems that struggle to generalize, our tool-based agentic theorem prover approach offers a generalizable methodology for formal verification across diverse scientific domains. Furthermore, we demonstrate Ax-Prover's assistant capabilities in a practical use case, showing how it enabled an expert mathematician to formalize the proof of a complex cryptography theorem.

large language model, logic & formal reasoning, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2510.12787

Country: North America > United States (1.00)

Genre:

Workflow (0.68)
Research Report > New Finding (0.46)
Instructional Material > Course Syllabus & Notes (0.46)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Checklist 1. For all authors (a)

Neural Information Processing SystemsNov-13-2025, 11:11:08 GMT

We first provide the necessary background on modal logic, which will inform the proofs in this appendix (Appendix D.1.1).

artificial intelligence, logic & formal reasoning, machine learning, (19 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.68)

Add feedback

Bridging Natural Language and ASP: A Hybrid Approach Using LLMs and AMR Parsing

Hite, Connar, Saud, Sean, Taha, Raef, Rahman, Nayim, Atahary, Tanvir, Douglass, Scott, Taha, Tarek

arXiv.org Artificial IntelligenceNov-13-2025

Answer Set Programming (ASP) is a declarative programming paradigm based on logic programming and non-monotonic reasoning. It is a tremendously powerful tool for describing and solving combinatorial problems. Like any other language, ASP requires users to learn how it works and the syntax involved. It is becoming increasingly required for those unfamiliar with programming languages to interact with code. This paper proposes a novel method of translating unconstrained English into ASP programs for logic puzzles using an LLM and Abstract Meaning Representation (AMR) graphs. Everything from ASP rules, facts, and constraints is generated to fully represent and solve the desired problem. Example logic puzzles are used to demonstrate the capabilities of the system. While most current methods rely entirely on an LLM, our system minimizes the role of the LLM only to complete straightforward tasks. The LLM is used to simplify natural language sentences, identify keywords, and generate simple facts. The AMR graphs are then parsed from simplified language and used to generate ASP constraints systematically. The system successfully creates an entire ASP program that solves a combinatorial logic problem. This approach is a significant first step in creating a lighter-weight, explainable system that converts natural language to solve complex logic problems.

constraint, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2511.08715

Country: North America > United States (0.68)

Genre: Research Report > Promising Solution (0.48)

Industry: Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Exploring the Paradigm Shift from Grounding to Skolemization for Complex Query Answering on Knowledge Graphs

Lu, Yuyin, Chen, Hegang, Xie, Shanrui, Rao, Yanghui, Xie, Haoran, Wang, Fu Lee, Li, Qing

arXiv.org Artificial IntelligenceNov-13-2025

Complex Query Answering (CQA) over incomplete Knowledge Graphs (KGs), typically formalized as reasoning with Existential First-Order predicate logic with one free variable (EFO\textsubscript{1}), faces a fundamental tradeoff between logic fidelity and computational efficiency. This work establishes a Grounding-Skolemization dichotomy to systematically analyze this challenge and motivate a paradigm shift in CQA. While Grounding-based methods inherently suffer from combinatorial explosion, most Skolemization-based methods neglect to explicitly model Skolem functions and compromise logical consistency. To address these limitations, we propose the Logic-constrained Vector Symbolic Architecture (LVSA), a neuro-symbolic framework that unifies a differentiable Skolemization module and a neural negator, as well as a logical constraint-driven optimization protocol to harmonize geometric and logical requirements. Theoretically, LVSA guarantees universality for all EFO\textsubscript{1} queries with low computational complexity. Empirically, it outperforms state-of-the-art Skolemization-based methods and reduces inference costs by orders of magnitude compared to Grounding-based baselines.

logic & formal reasoning, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2509.10837

Country: Asia > China > Hong Kong (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

Add feedback

How Artificial Intelligence Leads to Knowledge Why: An Inquiry Inspired by Aristotle's Posterior Analytics

Eelink, Guus, Rückschloß, Kilian, Weitkämper, Felix

arXiv.org Artificial IntelligenceNov-12-2025

Bayesian networks and causal models provide frameworks for handling queries about external interventions and counterfactuals, enabling tasks that go beyond what probability distributions alone can address. While these formalisms are often informally described as capturing causal knowledge, there is a lack of a formal theory characterizing the type of knowledge required to predict the effects of external interventions. This work introduces the theoretical framework of causal systems to clarify Aristotle's distinction between knowledge that and knowledge why within artificial intelligence. By interpreting existing artificial intelligence technologies as causal systems, it investigates the corresponding types of knowledge. Furthermore, it argues that predicting the effects of external interventions is feasible only with knowledge why, providing a more precise understanding of the knowledge necessary for such tasks.

artificial intelligence, logic & formal reasoning, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2504.0243

Country: Europe > Germany (0.67)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.36)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.36)

Add feedback

An Epistemic Perspective on Agent Awareness

Naumov, Pavel, Pavlova, Alexandra

arXiv.org Artificial IntelligenceNov-11-2025

The paper proposes to treat agent awareness as a form of knowledge, breaking the tradition in the existing literature on awareness. It distinguishes the de re and de dicto forms of such knowledge. The work introduces two modalities capturing these forms and formally specifies their meaning using a version of 2D-semantics. The main technical result is a sound and complete logical system describing the interplay between the two proposed modalities and the standard "knowledge of the fact" modality.

artificial intelligence, logic & formal reasoning, logic programming, (18 more...)

arXiv.org Artificial Intelligence

2511.05977

Country:

Asia (0.68)
Europe > United Kingdom > England (0.46)
North America > United States > California (0.28)

Genre: Research Report (0.40)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.70)
Information Technology > Artificial Intelligence > Robots (0.68)

Add feedback

UTF-8 Plumbing: Byte-level Tokenizers Unavoidably Enable LLMs to Generate Ill-formed UTF-8

Firestone, Preston, Ugare, Shubham, Singh, Gagandeep, Misailovic, Sasa

arXiv.org Artificial IntelligenceNov-11-2025

Subword tokenization segments input text according to a pre-defined vocabulary to feed it into a language model; the language model, in turn, generates a sequence made from this same vocabulary. The members of the vocabulary can be built of code points or bytes. Using code points means that all members of the vocabulary are valid UTF-8 characters. However, it also requires thousands of initial members to achieve acceptable coverage of inputs. Beginning with bytes, on the contrary, avoids out-of-vocabulary errors with only 256 initial members of the vocabulary, but the members of the vocabulary and sequences of them are not guaranteed to be valid UTF-8. Sequences that are not valid UTF-8 break code that assumes its input to be valid UTF-8. Applications of language models must account for the breakage thereby introduced. In this paper, we formalize tokenization using monoid theory and prove that tokenizers whose vocabularies contain tokens that are ill-formed UTF-8 can always produce sequences that are ill-formed UTF-8. We demonstrate formally that attempting to incrementally convert tokens back to a string and interpret the results as UTF-8 gives different results than converting the whole sequence of tokens at once. This formal result predicts real-world bugs: we evaluate mitigations for the problem identified and provide case studies of major foundation models, serving engines, and constrained generation systems.

large language model, logic & formal reasoning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2511.05578

Country:

Europe (1.00)
Asia (1.00)
North America > United States (0.68)

Genre: Research Report (0.52)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.93)

Add feedback

An Automated Theorem Generator with Theoretical Foundation Based on Rectangular Standard Contradiction

Xu, Yang, Liu, Peiyao, Chen, Shuwei, Liu, Jun

arXiv.org Artificial IntelligenceNov-7-2025

Currently, there is a lack of rigorous theoretical system for systematically generating non-trivial and logically valid theorems. Addressing this critical gap, this paper conducts research to propose a novel automated theorem generation theory and tool. Based on the concept of standard contradiction which possesses unique deductive advantages, this paper defines and proves, for the first time, a new logical structure known as rectangular standard contradiction. Centered on this structure, a complete Automated Theorem Generation (ATG) theory is put forward. Theoretical proofs clarify two core properties of rectangular standard contradiction: first, it is a standard contradiction (necessarily unsatisfiable); second, it exhibits non-redundancy (the remaining clause set becomes satisfiable after removing any clause). Leveraging these properties, this paper proves that partitioning a rectangular standard contradiction into a premise subset $A$ and negation of its complement $H$, a valid theorem $A \vdash \neg H$ can be formed, and all such theorems are logically equivalent. To implement this theory, an efficient template-based ATG algorithm is designed, and a Rectangular Automated Theorem Generator is developed. This research enables machines to transition from "verifiers" to "discoverers", opening up new avenues for fundamental research in the fields of logic and artificial intelligence.

artificial intelligence, logic & formal reasoning, standard contradiction, (14 more...)

arXiv.org Artificial Intelligence

2511.04092

Country: Asia > China > Sichuan Province (0.14)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)

Add feedback