AITopics

2511.09174

Country:

Europe (1.00)
Asia > China (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.88)

Vossel, Felix, Mossakowski, Till, Gehrke, Björn

Advancing Natural Language Formalization to First Order Logic with Fine-tuned LLMs

arXiv.org Artificial IntelligenceDec-2-2025

Automating the translation of natural language to first-order logic (FOL) is crucial for knowledge representation and formal methods, yet remains challenging. We present a systematic evaluation of fine-tuned LLMs for this task, comparing architectures (encoder-decoder vs. decoder-only) and training strategies. Using the MALLS and Willow datasets, we explore techniques like vocabulary extension, predicate conditioning, and multilingual training, introducing metrics for exact match, logical equivalence, and predicate alignment. Our fine-tuned Flan-T5-XXL achieves 70% accuracy with predicate lists, outperforming GPT-4o and even the DeepSeek-R1-0528 model with CoT reasoning ability as well as symbolic systems like ccg2lambda. Key findings show: (1) predicate availability boosts performance by 15-20%, (2) T5 models surpass larger decoder-only LLMs, and (3) models generalize to unseen logical arguments (FOLIO dataset) without specific training. While structural logic translation proves robust, predicate extraction emerges as the main bottleneck.

large language model, logic & formal reasoning, machine learning, (19 more...)

2509.22338

Country:

North America > Canada (0.28)
Europe > Switzerland (0.28)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.91)

AI for software engineering: from probable to provable

Meyer, Bertrand

Vibe coding, the much-touted use of AI techniques for programming, faces two overwhelming obstacles: the difficulty of specifying goals ("prompt engineering" is a form of requirements engineering, one of the toughest disciplines of software engineering); and the hallucination phenomenon. Programs are only useful if they are correct or very close to correct. The solution? Combine the creativity of artificial intelligence with the rigor of formal specification methods and the power of formal program verification, supported by modern proof tools.

large language model, logic & formal reasoning, natural language, (20 more...)

2511.23159

Genre: Research Report (0.64)

Technology:

Information Technology > Software Engineering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.50)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.35)

Impure Simplicial Complex and Term-Modal Logic with Assignment Operators

Yang, Yuanzhe

Impure simplicial complexes are a powerful tool to model multi-agent epistemic situations where agents may die, but it is difficult to define a satisfactory semantics for the ordinary propositional modal language on such models, since many conceptually dubious expressions involving dead agents can be expressed in this language. In this paper, we introduce a term-modal language with assignment operators, in which such conceptually dubious expressions are syntactically excluded. We define both simplicial semantics and first-order Kripke semantics for this language, characterize their respective expressivity through notions of bisimulation, and show that the two semantics are equivalent when we consider a special class of first order Kripke models called local epistemic models. We also offer a complete axiomatization for the epistemic logic based on this language, and show that our language has a notion of assignment normal form. Finally, we discuss the behavior of a kind of intensional distributed knowledge that can be naturally expressed in our language.

artificial intelligence, logic & formal reasoning, nullx, (14 more...)

doi: 10.4204/EPTCS.437.31

2511.22391

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.83)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.66)

Behnke, Gregor, Gattinger, Malvin, Ghosh, Avijeet, Wang, Haitian

Comparing State-Representations for DEL Model Checking

Model checking with the standard Kripke models used in (Dynamic) Epistemic Logic leads to scalability issues. Hence alternative representations have been developed, in particular symbolic structures based on Binary Decision Diagrams (BDDs) and succinct models based on mental programs. While symbolic structures have been shown to perform well in practice, their theoretical complexity was not known so far. On the other hand, for succinct models model checking is known to be PSPACE-complete, but no implementations are available. We close this gap and directly relate the two representations. We show that model checking DEL on symbolic structures encoded with BDDs is also PSPACE-complete. In fact, already model checking Epistemic Logic without dynamics is PSPACE-complete on symbolic structures. We also provide direct translations between BDDs and mental programs. Both translations yield exponential outputs. For the translation from mental programs to BDDs we show that no small translation exists. For the other direction we conjecture the same.

artificial intelligence, logic & formal reasoning, logic programming, (17 more...)

doi: 10.4204/EPTCS.437.21

2511.22382

Country: Europe (0.68)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)

Lorini, Emiliano, Rozplokhas, Dmitry

Graded Distributed Belief

The idea of using belief bases as formal semantics for multi-agent epistemic logic was first introduced in [26] and further developed in [27, 28]. This approach aligns with the sentential (or syntactic) perspective on knowledge representation [21, 13, 33, 20], which holds th at an agent's body of knowledge should be represented as a set of sentences in a formal language. The key novelty of belief base semantics, compared to traditional epistemic logic semantics based on multi-relational Kripke models [31, 12], lies in two main aspects. First, a possible world (or state) in a mo del is not treated as a primitive entity but is instead composed of the agents' belief bases and a valu ation of propositional atoms. Second, the agents' accessibility relations are not explicitly par t of the model but are determined a posteriori from their belief bases.

artificial intelligence, belief base, logic & formal reasoning, (15 more...)

doi: 10.4204/EPTCS.437.19

2511.22381

Country:

North America > United States (0.28)
Europe > Netherlands (0.28)
Europe > Austria (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Belief Revision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.88)

Wüst, Antonia, Stammer, Wolfgang, Shindo, Hikaru, Helff, Lukas, Dhami, Devendra Singh, Kersting, Kristian

Synthesizing Visual Concepts as Vision-Language Programs

Vision-Language models (VLMs) achieve strong performance on multimodal tasks but often fail at systematic visual reasoning tasks, leading to inconsistent or illogical outputs. Neuro-symbolic methods promise to address this by inducing interpretable logical rules, though they exploit rigid, domain-specific perception modules. We propose Vision-Language Programs (VLP), which combine the perceptual flexibility of VLMs with systematic reasoning of program synthesis. Rather than embedding reasoning inside the VLM, VLP leverages the model to produce structured visual descriptions that are compiled into neuro-symbolic programs. The resulting programs execute directly on images, remain consistent with task constraints, and provide human-interpretable explanations that enable easy shortcut mitigation. Experiments on synthetic and real-world datasets demonstrate that VLPs outperform direct and structured prompting, particularly on tasks requiring complex logical reasoning.

large language model, logic & formal reasoning, machine learning, (20 more...)

2511.18964

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.47)

ObjectAlign: Neuro-Symbolic Object Consistency Verification and Correction

Munir, Mustafa, Goel, Harsh, Wei, Xiwen, Choi, Minkyu, Shah, Sahil, Bhardwaj, Kartikeya, Whatmough, Paul, Chinchali, Sandeep, Marculescu, Radu

Video editing and synthesis often introduce object inconsistencies, such as frame flicker and identity drift that degrade perceptual quality. To address these issues, we introduce ObjectAlign, a novel framework that seamlessly blends perceptual metrics with symbolic reasoning to detect, verify, and correct object-level and temporal inconsistencies in edited video sequences. The novel contributions of ObjectAlign are as follows: First, we propose learnable thresholds for metrics characterizing object consistency (i.e. CLIP-based semantic similarity, LPIPS perceptual distance, histogram correlation, and SAM-derived object-mask IoU). Second, we introduce a neuro-symbolic verifier that combines two components: (a) a formal, SMT-based check that operates on masked object embeddings to provably guarantee that object identity does not drift, and (b) a temporal fidelity check that uses a probabilistic model checker to verify the video's formal representation against a temporal logic specification. A frame transition is subsequently deemed "consistent" based on a single logical assertion that requires satisfying both the learned metric thresholds and this unified neuro-symbolic constraint, ensuring both low-level stability and high-level temporal correctness. Finally, for each contiguous block of flagged frames, we propose a neural network based interpolation for adaptive frame repair, dynamically choosing the interpolation depth based on the number of frames to be corrected. This enables reconstruction of the corrupted frames from the last valid and next valid keyframes. Our results show up to 1.4 point improvement in CLIP Score and up to 6.1 point improvement in warp error compared to SOTA baselines on the DAVIS and Pexels video datasets.

logic & formal reasoning, machine learning, natural language, (20 more...)

2511.18701

Country: North America > United States (0.68)

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.46)

Ramani, Keshav, Tawosi, Vali, Alamir, Salwa, Borrajo, Daniel

Bridging LLM Planning Agents and Formal Methods: A Case Study in Plan Verification

We introduce a novel framework for evaluating the alignment between natural language plans and their expected behavior by converting them into Kripke structures and Linear Temporal Logic (LTL) using Large Language Models (LLMs) and performing model checking. We systematically evaluate this framework on a simplified version of the PlanBench plan verification dataset and report on metrics like Accuracy, Precision, Recall and F1 scores. Our experiments demonstrate that GPT-5 achieves excellent classification performance (F1 score of 96.3%) while almost always producing syntactically perfect formal representations that can act as guarantees. However, the synthesis of semantically perfect formal models remains an area for future exploration.

large language model, logic & formal reasoning, machine learning, (18 more...)

2510.03469

Country: Europe (0.47)

Genre: Research Report > New Finding (0.68)

Industry: Banking & Finance (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.40)

Bienvenu, Meghyn, Bourgaux, Camille, Inoue, Katsumi, Jean, Robin

A Rule-Based Approach to Specifying Preferences over Conflicting Facts and Querying Inconsistent Knowledge Bases

Repair-based semantics have been extensively studied as a means of obtaining meaningful answers to queries posed over inconsistent knowledge bases (KBs). While several works have considered how to exploit a priority relation between facts to select optimal repairs, the question of how to specify such preferences remains largely unaddressed. This motivates us to introduce a declarative rule-based framework for specifying and computing a priority relation between conflicting facts. As the expressed preferences may contain undesirable cycles, we consider the problem of determining when a set of preference rules always yields an acyclic relation, and we also explore a pragmatic approach that extracts an acyclic relation by applying various cycle removal techniques. Towards an end-to-end system for querying inconsistent KBs, we present a preliminary implementation and experimental evaluation of the framework, which employs answer set programming to evaluate the preference rules, apply the desired cycle resolution techniques to obtain a priority relation, and answer queries under prioritized-repair semantics.

artificial intelligence, conflict, logic & formal reasoning, (15 more...)

2508.07742

Country:

Europe > France (0.28)
Asia > Japan (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)