AITopics | Logic & Formal Reasoning

Collaborating Authors

Logic & Formal Reasoning

"I think the best hope for human-level AI is logical AI, based on the formalizing of commonsense knowledge and reasoning in mathematical logic. Formalizing common sense requires extensions to mathematical logic including nonmonotonic reasoning and extensive reification, e.g., of concepts and also contexts. The reifications require appropriate reflection schemas."
– from The Future of AI—A Manifesto by John McCarthy. AI Magazine 26(4), (2005).

News Overviews Instructional Materials AI-Alerts Classics

Anonymous Public Announcements

Ågotnes, Thomas, Galimullin, Rustam, Satoh, Ken, Tojo, Satoshi

arXiv.org Artificial IntelligenceApr-22-2025

We formalise the notion of an anonymous public announcement in the tradition of public announcement logic. Such announcements can be seen as in-between a public announcement from ``the outside" (an announcement of $ϕ$) and a public announcement by one of the agents (an announcement of $K_aϕ$): we get more information than just $ϕ$, but not (necessarily) about exactly who made it. Even if such an announcement is prima facie anonymous, depending on the background knowledge of the agents it might reveal the identity of the announcer: if I post something on a message board, the information might reveal who I am even if I don't sign my name. Furthermore, like in the Russian Cards puzzle, if we assume that the announcer's intention was to stay anonymous, that in fact might reveal more information. In this paper we first look at the case when no assumption about intentions are made, in which case the logic with an anonymous public announcement operator is reducible to epistemic logic. We then look at the case when we assume common knowledge of the intention to stay anonymous, which is both more complex and more interesting: in several ways it boils down to the notion of a ``safe" announcement (again, similarly to Russian Cards). Main results include formal expressivity results and axiomatic completeness for key logical languages.

artificial intelligence, logic & formal reasoning, logic programming, (18 more...)

arXiv.org Artificial Intelligence

2504.12546

Country:

Asia > Japan (0.28)
North America > United States (0.28)

Genre: Research Report (0.50)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.93)

Add feedback

Proof-Carrying Neuro-Symbolic Code

Komendantskaya, Ekaterina

arXiv.org Artificial IntelligenceApr-17-2025

This invited paper introduces the concept of "proof-carrying neuro-symbolic code" and explains its meaning and value, from both the "neural" and the "symbolic" perspectives. The talk outlines the first successes and challenges that this new area of research faces. Keywords: Neural Networks Cyber-Physical System Verification Programming Languages Neuro-Symbolic Programs. 1 Neuro-Symbolic Proofs and Programs Proof Carrying Code is a long tradition within programming language research, broadly referring to methods that interleave verification with executable code, thus avoiding the inevitable discrepancies that arise when the code and the proofs are handled in different languages. Although the term was coined by Necula [50] almost three decades ago, with time, it grew to encompass any languages that are powerful enough to handle both the coding and the proving. Examples are dependently-typed (Agda, Idris, Coq/Rocq) and refinement-typed (F*, Liquid Haskell) languages.

artificial intelligence, logic & formal reasoning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2504.12031

Country:

Europe (0.46)
North America > United States (0.28)

Genre: Research Report (0.64)

Industry:

Information Technology (0.46)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Kimina-Prover Preview: Towards Large Formal Reasoning Models with Reinforcement Learning

Wang, Haiming, Unsal, Mert, Lin, Xiaohan, Baksys, Mantas, Liu, Junqi, Santos, Marco Dos, Sung, Flood, Vinyes, Marina, Ying, Zhenzhe, Zhu, Zekai, Lu, Jianqiao, de Saxcé, Hugues, Bailey, Bolton, Song, Chendong, Xiao, Chenjun, Zhang, Dehao, Zhang, Ebony, Pu, Frederick, Zhu, Han, Liu, Jiawei, Bayer, Jonas, Michel, Julien, Yu, Longhui, Dreyfus-Schmidt, Léo, Tunstall, Lewis, Pagani, Luigi, Machado, Moreira, Bourigault, Pauline, Wang, Ran, Polu, Stanislas, Barroyer, Thibaut, Li, Wen-Ding, Niu, Yazhe, Fleureau, Yann, Hu, Yangyang, Yu, Zhouliang, Wang, Zihan, Yang, Zhilin, Liu, Zhengying, Li, Jia

arXiv.org Artificial IntelligenceApr-16-2025

We introduce Kimina-Prover Preview, a large language model that pioneers a novel reasoning-driven exploration paradigm for formal theorem proving, as showcased in this preview release. Trained with a large-scale reinforcement learning pipeline from Qwen2.5-72B, Kimina-Prover demonstrates strong performance in Lean 4 proof generation by employing a structured reasoning pattern we term \textit{formal reasoning pattern}. This approach allows the model to emulate human problem-solving strategies in Lean, iteratively generating and refining proof steps. Kimina-Prover sets a new state-of-the-art on the miniF2F benchmark, reaching 80.7% with pass@8192. Beyond improved benchmark performance, our work yields several key insights: (1) Kimina-Prover exhibits high sample efficiency, delivering strong results even with minimal sampling (pass@1) and scaling effectively with computational budget, stemming from its unique reasoning pattern and RL training; (2) we demonstrate clear performance scaling with model size, a trend previously unobserved for neural theorem provers in formal mathematics; (3) the learned reasoning style, distinct from traditional search algorithms, shows potential to bridge the gap between formal verification and informal mathematical intuition. We open source distilled versions with 1.5B and 7B parameters of Kimina-Prover

large language model, logic & formal reasoning, machine learning, (21 more...)

arXiv.org Artificial Intelligence

2504.11354

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
(3 more...)

Add feedback

Automated Testing of COBOL to Java Transformation

Hans, Sandeep, Kumar, Atul, Yasue, Toshikai, Ono, Kouichi, Krishnan, Saravanan, Sondhi, Devika, Satoh, Fumiko, Mitchell, Gerald, Kumar, Sachin, Saha, Diptikalyan

arXiv.org Artificial IntelligenceApr-16-2025

Recent advances in Large Language Model (LLM) based Generative AI techniques have made it feasible to translate enterprise-level code from legacy languages such as COBOL to modern languages such as Java or Python. While the results of LLM-based automatic transformation are encouraging, the resulting code cannot be trusted to correctly translate the original code, making manual validation of translated Java code from COBOL a necessary but time-consuming and labor-intensive process. In this paper, we share our experience of developing a testing framework for IBM Watsonx Code Assistant for Z (WCA4Z) [5], an industrial tool designed for COBOL to Java translation. The framework automates the process of testing the functional equivalence of the translated Java code against the original COBOL programs in an industry context. Our framework uses symbolic execution to generate unit tests for COBOL, mocking external calls and transforming them into JUnit tests to validate semantic equivalence with translated Java. The results not only help identify and repair any detected discrepancies but also provide feedback to improve the AI model.

large language model, machine learning, programming language, (25 more...)

arXiv.org Artificial Intelligence

2504.10548

Country: Europe > Austria (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Add feedback

A convergence law for continuous logic and continuous structures with finite domains

Koponen, Vera

arXiv.org Artificial IntelligenceApr-15-2025

We consider continuous relational structures with finite domain $[n] := \{1, \ldots, n\}$ and a many valued logic, $CLA$, with values in the unit interval and which uses continuous connectives and continuous aggregation functions. $CLA$ subsumes first-order logic on ``conventional'' finite structures. To each relation symbol $R$ and identity constraint $ic$ on a tuple the length of which matches the arity of $R$ we associate a continuous probability density function $μ_R^{ic} : [0, 1] \to [0, \infty)$. We also consider a probability distribution on the set $\mathbf{W}_n$ of continuous structures with domain $[n]$ which is such that for every relation symbol $R$, identity constraint $ic$, and tuple $\bar{a}$ satisfying $ic$, the distribution of the value of $R(\bar{a})$ is given by $μ_R^{ic}$, independently of the values for other relation symbols or other tuples. In this setting we prove that every formula in $CLA$ is asymptotically equivalent to a formula without any aggregation function. This is used to prove a convergence law for $CLA$ which reads as follows for formulas without free variables: If $φ\in CLA$ has no free variable and $I \subseteq [0, 1]$ is an interval, then there is $α\in [0, 1]$ such that, as $n$ tends to infinity, the probability that the value of $φ$ is in $I$ tends to $α$.

identity constraint, logic & formal reasoning, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2504.08923

Country: Europe (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Polygon: Symbolic Reasoning for SQL using Conflict-Driven Under-Approximation Search

Zhao, Pinhan, Wang, Yuepeng, Wang, Xinyu

arXiv.org Artificial IntelligenceApr-10-2025

We present a novel symbolic reasoning engine for SQL which can efficiently generate an input $I$ for $n$ queries $P_1, \cdots, P_n$, such that their outputs on $I$ satisfy a given property (expressed in SMT). This is useful in different contexts, such as disproving equivalence of two SQL queries and disambiguating a set of queries. Our first idea is to reason about an under-approximation of each $P_i$ -- that is, a subset of $P_i$'s input-output behaviors. While it makes our approach both semantics-aware and lightweight, this idea alone is incomplete (as a fixed under-approximation might miss some behaviors of interest). Therefore, our second idea is to perform search over an expressive family of under-approximations (which collectively cover all program behaviors of interest), thereby making our approach complete. We have implemented these ideas in a tool, Polygon, and evaluated it on over 30,000 benchmarks across two tasks (namely, SQL equivalence refutation and query disambiguation). Our evaluation results show that Polygon significantly outperforms all prior techniques.

artificial intelligence, encuasemantic, logic & formal reasoning, (19 more...)

arXiv.org Artificial Intelligence

2504.06542

Country:

Europe (0.92)
North America > United States > California > Los Angeles County (0.27)

Genre: Research Report > New Finding (0.65)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.93)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.67)

Add feedback

The Axiom-Based Atlas: A Structural Mapping of Theorems via Foundational Proof Vectors

Yoo, Harim

arXiv.org Artificial IntelligenceMar-31-2025

The Axiom-Based Atlas is a novel framework that structurally represents mathematical theorems as proof vectors over foundational axiom systems. By mapping the logical dependencies of theorems onto vectors indexed by axioms - such as those from Hilbert geometry, Peano arithmetic, or ZFC - we offer a new way to visualize, compare, and analyze mathematical knowledge. This vector-based formalism not only captures the logical foundation of theorems but also enables quantitative similarity metrics - such as cosine distance - between mathematical results, offering a new analytic layer for structural comparison. Using heatmaps, vector clustering, and AI-assisted modeling, this atlas enables the grouping of theorems by logical structure, not just by mathematical domain. We also introduce a prototype assistant (Atlas-GPT) that interprets natural language theorems and suggests likely proof vectors, supporting future applications in automated reasoning, mathematical education, and formal verification. This direction is partially inspired by Terence Tao's recent reflections on the convergence of symbolic and structural mathematics. The Axiom-Based Atlas aims to provide a scalable, interpretable model of mathematical reasoning that is both human-readable and AI-compatible, contributing to the future landscape of formal mathematical systems.

logic & formal reasoning, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2504.00063

Country: North America > United States > Texas (0.04)

Genre: Research Report (0.64)

Industry: Education (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.89)

Add feedback

CodeARC: Benchmarking Reasoning Capabilities of LLM Agents for Inductive Program Synthesis

Wei, Anjiang, Suresh, Tarun, Cao, Jiannan, Kannan, Naveen, Wu, Yuheng, Yan, Kai, Teixeira, Thiago S. F. X., Wang, Ke, Aiken, Alex

arXiv.org Artificial IntelligenceMar-29-2025

Inductive program synthesis, or programming by example, requires synthesizing functions from input-output examples that generalize to unseen inputs. While large language model agents have shown promise in programming tasks guided by natural language, their ability to perform inductive program synthesis is underexplored. Existing evaluation protocols rely on static sets of examples and held-out tests, offering no feedback when synthesized functions are incorrect and failing to reflect real-world scenarios such as reverse engineering. We propose CodeARC, the Code Abstraction and Reasoning Challenge, a new evaluation framework where agents interact with a hidden target function by querying it with new inputs, synthesizing candidate functions, and iteratively refining their solutions using a differential testing oracle. This interactive setting encourages agents to perform function calls and self-correction based on feedback. We construct the first large-scale benchmark for general-purpose inductive program synthesis, featuring 1114 functions. Among 18 models evaluated, o3-mini performs best with a success rate of 52.7%, highlighting the difficulty of this task. Fine-tuning LLaMA-3.1-8B-Instruct on curated synthesis traces yields up to a 31% relative performance gain. CodeARC provides a more realistic and challenging testbed for evaluating LLM-based program synthesis and inductive reasoning.

arxiv preprint arxiv, large language model, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2503.23145

Country:

North America > United States > Illinois > Champaign County > Urbana (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > China (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)

Add feedback

Challenges and Paths Towards AI for Software Engineering

Gu, Alex, Jain, Naman, Li, Wen-Ding, Shetty, Manish, Shao, Yijia, Li, Ziyang, Yang, Diyi, Ellis, Kevin, Sen, Koushik, Solar-Lezama, Armando

arXiv.org Artificial IntelligenceMar-28-2025

AI for software engineering has made remarkable progress recently, becoming a notable success within generative AI. Despite this, there are still many challenges that need to be addressed before automated software engineering reaches its full potential. It should be possible to reach high levels of automation where humans can focus on the critical decisions of what to build and how to balance difficult tradeoffs while most routine development effort is automated away. Reaching this level of automation will require substantial research and engineering efforts across academia and industry. In this paper, we aim to discuss progress towards this in a threefold manner. First, we provide a structured taxonomy of concrete tasks in AI for software engineering, emphasizing the many other tasks in software engineering beyond code generation and completion. Second, we outline several key bottlenecks that limit current approaches. Finally, we provide an opinionated list of promising research directions toward making progress on these bottlenecks, hoping to inspire future research in this rapidly maturing field.

large language model, machine learning, programming language, (21 more...)

arXiv.org Artificial Intelligence

2503.22625

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(10 more...)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.67)

Industry:

Information Technology > Security & Privacy (1.00)
Education (1.00)
Government > Regional Government > North America Government > United States Government (0.92)
(4 more...)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Software Engineering (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
(3 more...)

Add feedback

Reasoning Under Threat: Symbolic and Neural Techniques for Cybersecurity Verification

Veronica, Sarah

arXiv.org Artificial IntelligenceMar-27-2025

Cybersecurity demands rigorous and scalable techniques to ensure system correctness, robustness, and resilience against evolving threats. Automated reasoning, encompassing formal logic, theorem proving, model checking, and symbolic analysis, provides a foundational framework for verifying security properties across diverse domains such as access control, protocol design, vulnerability detection, and adversarial modeling. This survey presents a comprehensive overview of the role of automated reasoning in cybersecurity, analyzing how logical systems, including temporal, deontic, and epistemic logics are employed to formalize and verify security guarantees. We examine SOTA tools and frameworks, explore integrations with AI for neural-symbolic reasoning, and highlight critical research gaps, particularly in scalability, compositionality, and multi-layered security modeling. The paper concludes with a set of well-grounded future research directions, aiming to foster the development of secure systems through formal, automated, and explainable reasoning techniques.

artificial intelligence, jin song dong, logic & formal reasoning, (17 more...)

arXiv.org Artificial Intelligence

2503.22755

Country:

Oceania > Australia > Queensland > Brisbane (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)
Asia > Singapore (0.04)
(17 more...)

Genre: Overview (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)

Add feedback