McPheat, Lachlan
An Empirical Study of Conformal Prediction in LLM with ASP Scaffolds for Robust Reasoning
Kaur, Navdeep, McPheat, Lachlan, Russo, Alessandra, Cohn, Anthony G, Madhyastha, Pranava
In this paper, we examine the use of Conformal Language Modelling (CLM) alongside Answer Set Programming (ASP) to enhance the performance of standard open-weight LLMs on complex multi-step reasoning tasks. Using the StepGame dataset, which requires spatial reasoning, we apply CLM to generate sets of ASP programs from an LLM, providing statistical guarantees on the correctness of the outputs. Experimental results show that CLM significantly outperforms baseline models that use standard sampling methods, achieving substantial accuracy improvements across different levels of reasoning complexity. Additionally, the LLM-as-Judge metric enhances CLM's performance, especially in assessing structurally and logically correct ASP outputs. However, calibrating CLM with diverse calibration sets did not improve generalizability for tasks requiring much longer reasoning steps, indicating limitations in handling more complex tasks.
Vector Space Semantics for Lambek Calculus with Soft Subexponentials
McPheat, Lachlan, Wazni, Hadi, Sadrzadeh, Mehrnoosh
We develop a vector space semantics for Lambek Calculus with Soft Subexponentials, apply the calculus to construct compositional vector interpretations for parasitic gap noun phrases and discourse units with anaphora and ellipsis, and experiment with the constructions in a distributional sentence similarity task. In contrast to previous work, which used Lambek Calculus with a Relevant Modality, the calculus in this paper uses a bounded version of the modality and is decidable. The vector space semantics of this new modality allows us to meaningfully define contraction as projection and to provide a linear theory behind what we could previously only achieve via nonlinear maps.
DisCoCat for Donkey Sentences
McPheat, Lachlan, Wang, Daphne
Montague semantics is a compositional method for translating the semantics of written language into first-order logic. As a simple example, one can understand the meaning of the sentence "(all) dogs eat snacks" as ∀x, y. dogs(x) ∧ snacks(y) ⇒ eats(x, y). However, when translating the meaning of the sentence "Every farmer who owns a donkey beats it", the variable representing the donkey cannot be bound by the existential quantifier coming from the determiner 'a'. This issue was studied by Geach [4], who used it as a counterexample to the scope of Montague semantics. Many systems have been created to form semantic representations of donkey sentences. To name a few: dynamic predicate logic [7], where the binding rules of quantifiers in first-order logic are relaxed; discourse representation theory [11], where a collection of 'discourse referents' keeps track of individuals' mentions and these referents are identified to keep track of references; and an approach using dependent type theory [18], which exploits dependent sums to differentiate between ambiguous readings of donkey sentences. However, none of the models mentioned above are type-logical grammars, which raises the question of whether it is possible to parse donkey sentences and form usable representations of them using type-logical grammars. We propose to model donkey sentences using (an extension of) Lambek calculus, L. In the following section, we explain how a type-logical analysis of natural language works, and in Sections 1.3, 1.4, and 1.5 how to extend it to model more exotic linguistic phenomena, culminating in a parse of a donkey sentence. Then we introduce relational semantics and vector space semantics of the extended Lambek calculus in Sections 3.1 and 3.3 respectively, demonstrating how a donkey sentence is interpreted as a relation or as a linear map.
Categorical Vector Space Semantics for Lambek Calculus with a Relevant Modality
McPheat, Lachlan, Sadrzadeh, Mehrnoosh, Wazni, Hadi, Wijnholds, Gijs
We develop a categorical compositional distributional semantics for Lambek Calculus with a Relevant Modality !L*, which has limited versions of the contraction and permutation rules. The categorical part of the semantics is a monoidal biclosed category with a coalgebra modality, very similar to the structure of a Differential Category. We instantiate this category to finite-dimensional vector spaces and linear maps via "quantisation" functors and work with three concrete interpretations of the coalgebra modality. We apply the model to construct categorical and concrete semantic interpretations for the motivating example of !L*: the derivation of a phrase with a parasitic gap. The effectiveness of the concrete interpretations is evaluated via a disambiguation task, on an extension of a sentence disambiguation dataset to parasitic gap phrases, using BERT, Word2Vec, and FastText vectors and Relational tensors.