Advancing Natural Language Formalization to First Order Logic with Fine-tuned LLMs
Vossel, Felix, Mossakowski, Till, Gehrke, Björn
Automating the translation of natural language to first-order logic (FOL) is crucial for knowledge representation and formal methods, yet remains challenging. We present a systematic evaluation of fine-tuned LLMs for this task, comparing architectures (encoder-decoder vs. decoder-only) and training strategies. Using the MALLS and Willow datasets, we explore techniques like vocabulary extension, predicate conditioning, and multilingual training, introducing metrics for exact match, logical equivalence, and predicate alignment. Our fine-tuned Flan-T5-XXL achieves 70% accuracy with predicate lists, outperforming GPT-4o and even the DeepSeek-R1-0528 model with CoT reasoning ability as well as symbolic systems like ccg2lambda. Key findings show: (1) predicate availability boosts performance by 15-20%, (2) T5 models surpass larger decoder-only LLMs, and (3) models generalize to unseen logical arguments (FOLIO dataset) without specific training. While structural logic translation proves robust, predicate extraction emerges as the main bottleneck.
Interactive Query Answering on Knowledge Graphs with Soft Entity Constraints
Daza, Daniel, Bernardi, Alberto, Costabello, Luca, Gueret, Christophe, Mansoury, Masoud, Cochez, Michael, Schut, Martijn
Methods for query answering over incomplete knowledge graphs retrieve entities that are likely to be answers, which is particularly useful when such answers cannot be reached by direct graph traversal due to missing edges. However, existing approaches have focused on queries formalized using first-order logic. In practice, many real-world queries involve constraints that are inherently vague or context-dependent, such as preferences for attributes or related categories. Addressing this gap, we introduce the problem of query answering with soft constraints. We formalize the problem and introduce two efficient methods designed to adjust query answer scores by incorporating soft constraints without disrupting the original answers to a query. These methods are lightweight, requiring tuning only two parameters or a small neural network trained to capture soft constraints while maintaining the original ranking structure. To evaluate the task, we extend existing QA benchmarks by generating datasets with soft constraints. Our experiments demonstrate that our methods can capture soft constraints while maintaining robust query answering performance and adding very little overhead. With our work, we explore a new and flexible way to interact with graph databases that allows users to specify their preferences by providing examples interactively.
On the Limits of Hierarchically Embedded Logic in Classical Neural Networks
We propose a formal model of reasoning limitations in large neural net models for language, grounded in the depth of their neural architecture. By treating neural networks as linear operators over logic predicate space, we show that each layer can encode at most one additional level of logical reasoning. We prove that a neural network of a particular depth cannot faithfully represent predicates in a logic one order higher, such as simple counting over complex predicates, implying a strict upper bound on logical expressiveness. This structure induces a nontrivial null space during tokenization and embedding, excluding higher-order predicates from representability. Our framework offers a natural explanation for phenomena such as hallucination, repetition, and limited planning, while also providing a foundation for understanding how approximations to higher-order logic may emerge. These results motivate architectural extensions and interpretability strategies in future development of language models.
Graph Neural Networks with polynomial activations have limited expressivity
The expressivity of Graph Neural Networks (GNNs) can be entirely characterized by appropriate fragments of first-order logic. Namely, any query of the two-variable fragment of graded modal logic (GC2) interpreted over labeled graphs can be expressed using a GNN whose size depends only on the depth of the query. As pointed out by [Barcelo et al., 2020; Grohe, 2021], this description holds for a family of activation functions, leaving the possibility of a hierarchy of logics expressible by GNNs depending on the chosen activation function. In this article, we show that such a hierarchy indeed exists by proving that GC2 queries cannot be expressed by GNNs with polynomial activation functions. This implies a separation between polynomial and popular non-polynomial activations (such as Rectified Linear Units) and answers an open question formulated by [Grohe, 2021].
Deep Adaptive Semantic Logic (DASL): Compiling Declarative Knowledge into Deep Neural Networks
Sikka, Karan, Silberfarb, Andrew, Byrnes, John, Sur, Indranil, Chow, Ed, Divakaran, Ajay, Rohwer, Richard
We introduce Deep Adaptive Semantic Logic (DASL), a novel framework for automating the generation of deep neural networks that incorporates user-provided formal knowledge to improve learning from data. We provide formal semantics that demonstrate that our knowledge representation captures all of first order logic and that finite sampling from infinite domains converges to correct truth values. DASL's representation improves on prior neural-symbolic work by avoiding vanishing gradients, allowing deeper logical structure, and enabling richer interactions between the knowledge and learning components. We illustrate DASL through a toy problem in which we add structure to an image classification problem and demonstrate that knowledge of that structure reduces data requirements by a factor of $1000$. We then evaluate DASL on a visual relationship detection task and demonstrate that the addition of commonsense knowledge improves performance by $10.7\%$ in a data-scarce setting.
The Higher-Order Prover Leo-III (Extended Version)
Steen, Alexander, Benzmüller, Christoph
The automated theorem prover Leo-III for classical higher-order logic with Henkin semantics and choice is presented. Leo-III is based on extensional higher-order paramodulation and accepts every common TPTP dialect (FOF, TFF, THF), including their recent extensions to rank-1 polymorphism (TF1, TH1). In addition, the prover natively supports almost every normal higher-order modal logic. Leo-III cooperates with first-order reasoning tools using translations to many-sorted first-order logic and produces verifiable proof certificates. The prover is evaluated on heterogeneous benchmark sets.
Towards a Model Theory for Distributed Representations
Distributed representations (such as those based on embeddings) and discrete representations (such as those based on logic) have complementary strengths. We explore one possible approach to combining these two kinds of representations. We present a model theory/semantics for first order logic based on vectors of reals. We describe the model theory, discuss some interesting properties of such a system and present a simple approach to query answering.
The Complexity of Reasoning with FODD and GFODD
Hescott, Benjamin J. (Tufts University) | Khardon, Roni (Tufts University)
Recent work introduced Generalized First Order Decision Diagrams (GFODD) as a knowledge representation that is useful in mechanizing decision theoretic planning in relational domains. GFODDs generalize function-free first order logic and include numerical values and numerical generalizations of existential and universal quantification. Previous work presented heuristic inference algorithms for GFODDs. In this paper, we study the complexity of the evaluation problem, the satisfiability problem, and the equivalence problem for GFODDs under the assumption that the size of the intended model is given with the problem, a restriction that guarantees decidability. Our results provide a complete characterization. The same characterization applies to the corresponding restriction of problems in first order logic, giving an interesting new avenue for efficient inference when the number of objects is bounded. Our results show that for $\Sigma_k$ formulas, and for corresponding GFODDs, evaluation and satisfiability are $\Sigma_k^p$-complete, and equivalence is $\Pi_{k+1}^p$-complete. For $\Pi_k$ formulas, evaluation is $\Pi_k^p$-complete, satisfiability is one level higher and is $\Sigma_{k+1}^p$-complete, and equivalence is $\Pi_{k+1}^p$-complete.
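To make the bounded-model setting concrete, the following minimal sketch evaluates a prenex first-order formula over an explicitly given finite domain by brute force. It covers plain first-order quantifiers only, not the numerical aggregation of GFODDs, and it is not the paper's algorithm; the function `evaluate`, the `edge` relation, and the example domain are all hypothetical names chosen for illustration.

```python
def evaluate(prefix, matrix, domain, assignment=None):
    """Evaluate a prenex formula over a finite domain by exhaustive search.

    prefix: list of (quantifier, variable) pairs, outermost first,
            where quantifier is "exists" or "forall".
    matrix: function mapping a variable assignment (dict) to a bool.
    domain: finite collection of objects (the given model size bounds the search).
    """
    assignment = dict(assignment or {})
    if not prefix:
        return matrix(assignment)
    (quantifier, variable), rest = prefix[0], prefix[1:]
    # Try every object in the domain for the outermost variable.
    results = (
        evaluate(rest, matrix, domain, {**assignment, variable: value})
        for value in domain
    )
    if quantifier == "exists":
        return any(results)
    if quantifier == "forall":
        return all(results)
    raise ValueError(f"unknown quantifier: {quantifier}")


# Hypothetical model: four objects and a binary relation `edge`.
domain = range(4)
edge = {(0, 0), (0, 1), (0, 2), (0, 3)}


def has_edge(a):
    return (a["x"], a["y"]) in edge


# Sigma_2-style formula: exists x. forall y. edge(x, y)
sigma2 = evaluate([("exists", "x"), ("forall", "y")], has_edge, domain)
# Pi_2-style formula: forall x. exists y. edge(x, y)
pi2 = evaluate([("forall", "x"), ("exists", "y")], has_edge, domain)
```

Each quantifier alternation adds one nested search over the domain, which gives an informal picture of why the number of alternations is the parameter that appears in the complexity classes above.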
Lp: A Logic for Statistical Information
This extended abstract presents a logic, called Lp, that is capable of representing and reasoning with a wide variety of both qualitative and quantitative statistical information. The advantage of this logical formalism is that it offers a declarative representation of statistical knowledge; knowledge represented in this manner can be used for a variety of reasoning tasks. The logic differs from previous work in probability logics in that it uses a probability distribution over the domain of discourse, whereas most previous work (e.g., Nilsson [2], Scott et al. [3], Gaifman [4], Fagin et al. [5]) has investigated the attachment of probabilities to the sentences of the logic (also, see Halpern [6] and Bacchus [7] for further discussion of the differences). The logic Lp possesses some further important features. First, Lp is a superset of first order logic, hence it can represent ordinary logical assertions. This means that Lp provides a mechanism for integrating statistical information and reasoning about uncertainty into systems based solely on logic. Second, Lp possesses transparent semantics, based on sets and probabilities of those sets. Hence, knowledge represented in Lp can be understood in terms of the simple primitive concepts of sets and probabilities. And finally, there is a sound proof theory that has wide coverage (the proof theory is complete for certain classes of models). The proof theory captures a sufficient range of valid inferences to subsume most previous probabilistic uncertainty reasoning systems. For example, the linear constraints like those generated by Nilsson's probabilistic entailment [2] can be generated by the proof theory, and the Bayesian inference underlying belief nets [8] can be performed. In addition, the proof theory integrates quantitative and qualitative reasoning as well as statistical and logical reasoning. In the next section we briefly examine previous work in probability logics, comparing it to Lp.
Then we present some of the varieties of statistical information that Lp is capable of expressing. After this we present, briefly, the syntax, semantics, and proof theory of the logic. We conclude with a few examples of knowledge representation and reasoning in Lp, pointing out the advantages of the declarative representation offered by Lp. We close with a brief discussion of probabilities as degrees of belief, indicating how such probabilities can be generated from statistical knowledge encoded in Lp. The reader who is interested in a more complete treatment should consult Bacchus [7].
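The distinction between a probability distribution over the domain of discourse and probabilities attached to sentences can be illustrated with a small sketch. The code below is not Lp's syntax or proof theory; it only assumes a finite domain with non-negative integer weights and computes the probability of the set of objects satisfying a predicate. The domain, the weights, and the predicates `is_bird` and `can_fly` are hypothetical.

```python
from fractions import Fraction


def probability(domain, weights, predicate):
    """Probability mass of the set of domain objects satisfying `predicate`."""
    total = sum(weights[d] for d in domain)
    satisfied = sum(weights[d] for d in domain if predicate(d))
    return Fraction(satisfied, total)


def conditional_probability(domain, weights, predicate, given):
    """Proportion of objects satisfying `predicate` among those satisfying `given`."""
    denominator = probability(domain, weights, given)
    if denominator == 0:
        raise ZeroDivisionError("conditioning set has probability zero")
    joint = probability(domain, weights, lambda d: predicate(d) and given(d))
    return joint / denominator


# Hypothetical domain with a uniform distribution over its objects.
domain = ["tweety", "opus", "polly", "rex"]
weights = {name: 1 for name in domain}
birds = {"tweety", "opus", "polly"}
fliers = {"tweety", "polly"}


def is_bird(d):
    return d in birds


def can_fly(d):
    return d in fliers


# Statistical statement: what proportion of birds fly?
p_fly_given_bird = conditional_probability(domain, weights, can_fly, is_bird)
print(p_fly_given_bird)
```

Here the number describes a proportion of objects in the domain, not a degree of belief in a single sentence such as "tweety flies", which is the contrast the abstract draws with earlier probability logics.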
Heterogeneous knowledge representation using a finite automaton and first order logic: a case study in electromyography
Rialle, Vincent, Vila, Annick, Besnard, Yves
In a certain number of situations, human cognitive functioning is difficult to represent with classical artificial intelligence structures. Such a difficulty arises in polyneuropathy diagnosis, which is based on the spatial distribution of lesions along the nerve fibres, together with the synthesis of several partial diagnoses. Faced with this problem while building an expert system (NEUROP), we developed a heterogeneous knowledge representation associating a finite automaton with first order logic. A number of knowledge representation problems raised by the features of the electromyography test are examined in this study, and the expert system architecture that allows such knowledge modeling is laid out.
Keywords: Medical expert systems, Heterogeneous knowledge representation, Finite automata, Electromyography.
1. Introduction
The various kinds of knowledge and reasoning used in expert systems (ES) have been carefully analyzed and classified over several years [6,11,17]. Nevertheless, some types of knowledge remain difficult to represent by means of the classical structures (production rules, frames, semantic nets, etc.) commonly used in expert systems.