Goto

Collaborating Authors

 relevance theory


Pragmatic Theories Enhance Understanding of Implied Meanings in LLMs

arXiv.org Artificial Intelligence

The ability to accurately interpret implied meanings plays a crucial role in human communication and language use, and language models are also expected to possess this capability. This study demonstrates that providing language models with pragmatic theories as prompts is an effective in-context learning approach for tasks to understand implied meanings. Specifically, we propose an approach in which an overview of pragmatic theories, such as Gricean pragmatics and Relevance Theory, is presented as a prompt to the language model, guiding it through a step-by-step reasoning process to derive a final interpretation. Experimental results showed that, compared to the baseline, which prompts intermediate reasoning without presenting pragmatic theories (0-shot Chain-of-Thought), our methods enabled language models to achieve up to 9.6\% higher scores on pragmatic reasoning tasks. Furthermore, we show that even without explaining the details of pragmatic theories, merely mentioning their names in the prompt leads to a certain performance improvement (around 1-3%) in larger models compared to the baseline.


Schüller

AAAI Conferences

We study disambiguating of pronoun references in Winograd Schemas, which are part of the Winograd Schema Challenge, a proposed replacement for the Turing test. In particular we consider sentences where the pronoun can be resolved to both antecedents without semantic violations in world knowledge, that means for both readings of the sentence there is a possible consistent world. Nevertheless humans will strongly prefer one answer, which can be explained by pragmatic effects described in Relevance Theory. We state formal optimization criteria based on principles of Relevance Theory in a simplification of Roger Schank's graph framework for natural language understanding. We perform experiments using Answer Set Programming and report the usefulness of our criteria for disambiguation and their sensitivity to parameter variations.