presupposition
LLMs Struggle to Reject False Presuppositions when Misinformation Stakes are High
Sieker, Judith, Lachenmaier, Clara, Zarrieß, Sina
This paper examines how LLMs handle false presuppositions and whether certain linguistic factors influence their responses to falsely presupposed content. Presuppositions subtly introduce information as given, making them highly effective at embedding disputable or false information. This raises concerns that LLMs, like humans, may fail to detect and correct misleading assumptions introduced as false presuppositions, even when the stakes of misinformation are high. Using a systematic approach based on linguistic presupposition analysis, we investigate the conditions under which LLMs are more or less likely to adopt or reject false presuppositions. Focusing on political contexts, we examine how factors such as linguistic construction, political party, and scenario probability affect the recognition of false presuppositions. We conduct experiments with a newly created dataset and examine three LLMs: OpenAI's GPT-4o, Meta's Llama-3-8B, and Mistral AI's Mistral-7B-v0.3. Our results show that the models struggle to recognize false presuppositions, with performance varying by condition. This study highlights that linguistic presupposition analysis is a valuable tool for uncovering the reinforcement of political misinformation in LLM responses.
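To make this kind of evaluation concrete, here is a minimal sketch of how a false-presupposition probe can be constructed and scored. The claim, the `query_model` stub, and the keyword-based rejection check are illustrative assumptions, not the authors' actual pipeline.

```python
# Sketch of probing a model with a false presupposition (invented claim;
# `query_model` is a hypothetical stub, not any provider's real API).
def query_model(prompt: str) -> str:
    # Stand-in response for illustration; replace with a real model call.
    return "There is no evidence that Senator Smith resigned."

FALSE_CLAIM = "Senator Smith resigned over the budget scandal"  # invented

prompts = {
    # Direct question: the claim is asserted, so it can be challenged.
    "direct": f"Is it true that {FALSE_CLAIM}?",
    # Loaded question: a wh-question presupposes the claim instead.
    "loaded": "Why did Senator Smith resign over the budget scandal?",
}

def rejects_presupposition(response: str) -> bool:
    """Crude keyword heuristic; studies typically use human or LLM judges."""
    markers = ("did not resign", "didn't resign", "no evidence", "not true")
    return any(m in response.lower() for m in markers)

for condition, prompt in prompts.items():
    print(condition, rejects_presupposition(query_model(prompt)))
```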
P-ReMIS: Pragmatic Reasoning in Mental Health and a Social Implication
Oram, Sneha, Bhattacharyya, Pushpak
Although explainability and interpretability have received significant attention in artificial intelligence (AI) and natural language processing (NLP) for mental health, reasoning has not been examined in the same depth. Addressing this gap is essential to bridge NLP and mental health through interpretable and reasoning-capable AI systems. To this end, we investigate the pragmatic reasoning capability of large language models (LLMs) in the mental health domain. We introduce the PRiMH dataset and propose pragmatic reasoning tasks in mental health that target the pragmatic phenomena of implicature and presupposition. In particular, we formulate two tasks on implicature and one task on presupposition. To benchmark the dataset and the proposed tasks, we consider four models: Llama3.1, Mistral, MentaLLaMA, and Qwen. The experimental results suggest that Mistral and Qwen show substantial reasoning abilities in the domain. We then study the behavior of MentaLLaMA on the proposed reasoning tasks with the attention-rollout mechanism. In addition, we propose three StiPRompts to study the stigma around mental health with the state-of-the-art LLMs GPT-4o-mini, DeepSeek-chat, and Claude-3.5-haiku. Our findings show that Claude-3.5-haiku deals with stigma more responsibly than the other two LLMs.
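Attention rollout aggregates per-layer attention maps into token-to-token influence scores by recursively multiplying them, folding in the residual connection as an identity term. A minimal NumPy sketch of that computation follows; the random matrices stand in for a real model's head-averaged attention, and this is a generic reconstruction of the technique, not the authors' code.

```python
import numpy as np

def attention_rollout(attentions: list[np.ndarray]) -> np.ndarray:
    """Aggregate per-layer attention maps (each [seq, seq], averaged over
    heads) into token-to-token influence scores via attention rollout."""
    seq_len = attentions[0].shape[0]
    rollout = np.eye(seq_len)
    for layer_attn in attentions:
        # Fold in the residual connection, then renormalize rows.
        attn = 0.5 * layer_attn + 0.5 * np.eye(seq_len)
        attn = attn / attn.sum(axis=-1, keepdims=True)
        rollout = attn @ rollout
    return rollout

# Toy usage: random row-stochastic "attention" for 4 tokens, 3 layers.
rng = np.random.default_rng(0)
layers = [rng.dirichlet(np.ones(4), size=4) for _ in range(3)]
print(attention_rollout(layers).round(3))
```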
Safer in Translation? Presupposition Robustness in Indic Languages
Palnitkar, Aadi, Suresh, Arjun, Rajesh, Rishi, Puli, Puneet
More and more people are turning to large language models (LLMs) for healthcare advice and consultation, making it important to gauge the efficacy and accuracy of LLM responses to such queries. While existing medical benchmarks seek to accomplish this task, they are almost universally in English, leaving a notable gap in the literature on multilingual LLM evaluation. In this work, we help address this gap with Cancer-Myth-Indic, an Indic-language benchmark built by translating a 500-item subset of Cancer-Myth, sampled evenly across its original categories, into five under-served but widely used languages of the subcontinent (500 items per language; 2,500 translated items in total). Native-speaker translators followed a style guide for preserving implicit presuppositions in translation; items feature false presuppositions relating to cancer. We evaluate several popular LLMs under this presupposition stress test.
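The even sampling across categories described above amounts to a simple stratified draw. The sketch below assumes each benchmark item carries a `category` field; the field name and item shape are illustrative, since the original schema is not given.

```python
import random
from collections import defaultdict

def sample_evenly(items: list[dict], n_total: int) -> list[dict]:
    """Draw n_total items with an equal share from each category
    (assumes every item is a dict with a 'category' key)."""
    by_cat = defaultdict(list)
    for item in items:
        by_cat[item["category"]].append(item)
    per_cat = n_total // len(by_cat)
    sample = []
    for cat_items in by_cat.values():
        sample.extend(random.sample(cat_items, per_cat))
    return sample

# Toy usage: 6 items over 3 categories, drawing 2 per category.
items = [{"category": c, "id": i} for i, c in enumerate("AABBCC")]
print(len(sample_evenly(items, 6)))  # 6
```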
Implementing a Logical Inference System for Japanese Comparatives
Mikami, Yosuke, Matsuoka, Daiki, Yanaka, Hitomi
Natural Language Inference (NLI) involving comparatives is challenging because it requires understanding quantities and comparative relations expressed by sentences. While some approaches leverage Large Language Models (LLMs), we focus on logic-based approaches grounded in compositional semantics, which are promising for robust handling of numerical and logical expressions. Previous studies along these lines have proposed logical inference systems for English comparatives. However, it has been pointed out that there are several morphological and semantic differences between Japanese and English comparatives. These differences make it difficult to apply such systems directly to Japanese comparatives. To address this gap, this study proposes ccg-jcomp, a logical inference system for Japanese comparatives based on compositional semantics. We evaluate the proposed system on a Japanese NLI dataset containing comparative expressions. We demonstrate the effectiveness of our system by comparing its accuracy with that of existing LLMs.
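For readers unfamiliar with degree semantics, the standard compositional analysis that such logic-based systems build on treats a gradable adjective as a relation between individuals and degrees. The fragment below is a textbook-style illustration, not necessarily the exact semantics implemented in ccg-jcomp.

```latex
% "Ken is taller than Jun" is true iff Ken's maximal height degree
% exceeds Jun's (illustrative degree-semantics analysis):
\[
  \max\{d \mid \mathrm{tall}(\mathrm{ken}, d)\} >
  \max\{d \mid \mathrm{tall}(\mathrm{jun}, d)\}
\]
% From this, a logical inference system can validate entailments such as:
% "Ken is taller than Jun" + "Jun is 170 cm tall"
%   => "Ken is more than 170 cm tall".
```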
Measuring Sycophancy of Language Models in Multi-turn Dialogues
Hong, Jiseung, Byun, Grace, Kim, Seungone, Shu, Kai, Choi, Jinho D.
Large Language Models (LLMs) are expected to provide helpful and harmless responses, yet they often exhibit sycophancy, conforming to user beliefs regardless of factual accuracy or ethical soundness. Prior research on sycophancy has primarily focused on single-turn factual correctness, overlooking the dynamics of real-world interactions. In this work, we introduce SYCON Bench, a novel benchmark for evaluating sycophantic behavior in multi-turn, free-form conversational settings. Our benchmark measures how quickly a model conforms to the user (Turn of Flip) and how frequently it shifts its stance under sustained user pressure (Number of Flip). Applying SYCON Bench to 17 LLMs across three real-world scenarios, we find that sycophancy remains a prevalent failure mode. Our analysis shows that alignment tuning amplifies sycophantic behavior, whereas model scaling and reasoning optimization strengthen the model's ability to resist undesirable user views. Reasoning models generally outperform instruction-tuned models but often fail when they over-index on logical exposition instead of directly addressing the user's underlying beliefs. Finally, we evaluate four additional prompting strategies and demonstrate that adopting a third-person perspective reduces sycophancy by up to 63.8% in the debate scenario. We release our code and data at https://github.com/JiseungHong/SYCON-Bench.
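Given the definitions above, both metrics reduce to simple functions of a per-turn stance trace. A minimal sketch, assuming stances have already been labeled per turn (the labeling itself, via judge models or annotators, is the hard part and is elided here):

```python
def turn_of_flip(stances: list[bool]) -> int | None:
    """First turn (1-indexed) at which the model abandons its initial
    stance; None if it never flips. True = holds original position."""
    for turn, holds in enumerate(stances, start=1):
        if not holds:
            return turn
    return None

def number_of_flips(stances: list[bool]) -> int:
    """Count stance reversals across consecutive turns."""
    return sum(a != b for a, b in zip(stances, stances[1:]))

# Toy trace: model resists for 2 turns, caves, then recovers once.
trace = [True, True, False, False, True]
print(turn_of_flip(trace), number_of_flips(trace))  # 3 2
```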
Can LLMs Ground when they (Don't) Know: A Study on Direct and Loaded Political Questions
Lachenmaier, Clara, Sieker, Judith, Zarrieß, Sina
Communication among humans relies on conversational grounding, allowing interlocutors to reach mutual understanding even when they do not have perfect knowledge and must resolve discrepancies in each other's beliefs. This paper investigates how large language models (LLMs) manage common ground in cases where they (don't) possess knowledge, focusing on facts in the political domain, where the risk of misinformation and grounding failure is high. We examine the ability of LLMs to answer direct knowledge questions and loaded questions that presuppose misinformation. We evaluate whether loaded questions lead LLMs to engage in active grounding and correct false user beliefs, in connection with their level of knowledge and their political bias. Our findings highlight significant challenges in LLMs' ability to engage in grounding and reject false user beliefs, raising concerns about their role in mitigating misinformation in political discourse.
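A direct/loaded question pair makes the contrast concrete, and responses can be coded by how they treat the presupposition. The pair and the three-way coding below are invented illustrations, not items or labels from the authors' dataset.

```python
# Invented direct/loaded pair; not an item from the authors' dataset.
direct = "Did Country X ban solar panels in 2020?"
loaded = "When did Country X ban solar panels?"  # presupposes the ban

def grounding_act(answer: str) -> str:
    """Toy three-way coding of how an answer treats the presupposition;
    real evaluations rely on judge models or human annotators."""
    lowered = answer.lower()
    if any(m in lowered for m in ("never banned", "did not ban", "no ban")):
        return "reject"   # active grounding: corrects the false belief
    if "banned" in lowered:
        return "accept"   # adopts the false presupposition
    return "evade"        # neither confirms nor corrects

print(grounding_act("Country X never banned solar panels."))  # reject
```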
They want to pretend not to understand: The Limits of Current LLMs in Interpreting Implicit Content of Political Discourse
Paci, Walter, Panunzi, Alessandro, Pezzelle, Sandro
Implicit content plays a crucial role in political discourse, where speakers systematically employ pragmatic strategies such as implicatures and presuppositions to influence their audiences. Large Language Models (LLMs) have demonstrated strong performance in tasks requiring complex semantic and pragmatic understanding, highlighting their potential for detecting and explaining the meaning of implicit content. However, their ability to do this within political discourse remains largely underexplored. Leveraging, for the first time, the large IMPAQTS corpus, which comprises Italian political speeches annotated for manipulative implicit content, we propose methods to test the effectiveness of LLMs on this challenging problem. Through a multiple-choice task and an open-ended generation task, we demonstrate that all tested models struggle to interpret presuppositions and implicatures. We conclude that current LLMs lack the key pragmatic capabilities necessary for accurately interpreting highly implicit language, such as that found in political discourse. At the same time, we highlight promising trends and future directions for enhancing model performance. We release our data and code at https://github.com/WalterPaci/IMPAQTS-PID.
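A multiple-choice probe of implicit content can be scored as in the sketch below. The utterance, the answer options, and the `query_model` stub are invented for illustration and are not drawn from IMPAQTS.

```python
# Sketch of a multiple-choice probe for implicit content; the utterance
# and options are invented, and `query_model` is a hypothetical stub.
def query_model(prompt: str) -> str:
    return "A"  # stand-in; replace with a real model call

utterance = "We will restore order in our streets."  # invented
options = {
    "A": "The speaker presupposes that order has been lost.",  # target
    "B": "The speaker asserts that streets will be widened.",
    "C": "The utterance carries no implicit content.",
}
prompt = (
    f"Utterance: {utterance}\n"
    "Which option best describes its implicit content?\n"
    + "\n".join(f"{k}. {v}" for k, v in options.items())
    + "\nAnswer with a single letter."
)
answer = query_model(prompt)
print("correct" if answer.strip().upper().startswith("A") else "wrong")
```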
Conjoined Predication and Scalar Implicature
Magri (2016) has discussed two puzzles raised by conjunction. While the first puzzle has not been resolved, a solution to the second puzzle has been proposed by Magri. The first puzzle conceals an interrelationship between quantification, collective/concurrent interpretation, and contextual updates, aspects of which have not been explored. In brief, the puzzle is that certain variants of sentences such as # Some Italians come from a warm country involving conjunction, as in # (Only) Some Italians come from a warm country and are blond, remain odd despite the fact that no alternative seems to trigger the mismatching scalar implicature. In this paper, we offer a conceptual analysis of Magri's first puzzle, by first presenting it in the context of the theory in which it arises. This paper proposes that the oddness arises due to the collective-concurrent interpretation of the conjunctive predicate, as underlined in # (Only) Some Italians come from a warm country and are blond, which ends up giving rise to an indirect contextual contradiction. It is suggested that the generation of scalar implicatures may have pragmatically governed facets not fully conditioned by accounts of exhaustification-based grammatical licensing of scalar implicatures.
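For readers outside the exhaustification literature, the standard derivation of why the plain sentence is odd runs as follows; this is a textbook-style reconstruction, not Magri's full formal system.

```latex
% Why "# Some Italians come from a warm country" is odd (sketch).
% Let p = [[some Italians come from a warm country]] and
%     q = [[all Italians come from a warm country]] (the scalar alternative).
% Exhaustification negates the stronger alternative:
\[
  \mathrm{exh}(p) \;=\; p \wedge \neg q
\]
% Common ground C: all Italians come from the same country, so p and q
% are contextually equivalent, and the strengthened meaning contradicts C:
\[
  C \models p \leftrightarrow q
  \quad\Rightarrow\quad
  C \wedge \mathrm{exh}(p) \models q \wedge \neg q
\]
% Magri's first puzzle: the conjoined variant stays odd even though no
% alternative appears to generate such a mismatching implicature.
```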
Let's CONFER: A Dataset for Evaluating Natural Language Inference Models on CONditional InFERence and Presupposition
Azin, Tara, Dumitrescu, Daniel, Inkpen, Diana, Singh, Raj
Natural Language Inference (NLI) is the task of determining whether a sentence pair represents entailment, contradiction, or a neutral relationship. While NLI models perform well on many inference tasks, their ability to handle fine-grained pragmatic inferences, particularly presupposition in conditionals, remains underexplored. In this study, we introduce CONFER, a novel dataset designed to evaluate how NLI models process inference in conditional sentences. We assess the performance of four NLI models, including two pre-trained models, to examine their generalization to conditional reasoning. Additionally, we evaluate Large Language Models (LLMs), including GPT-4o, LLaMA, Gemma, and DeepSeek-R1, in zero-shot and few-shot prompting settings to analyze their ability to infer presuppositions with and without prior context. Our findings indicate that NLI models struggle with presuppositional reasoning in conditionals, and fine-tuning on existing NLI datasets does not necessarily improve their performance.
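Presuppositions triggered inside a conditional's antecedent typically project out of the conditional, which is what such probes test. The item, labels, and `query_model` stub below are invented illustrations of the zero-shot versus few-shot setup, not examples from CONFER.

```python
# Presupposition projection out of a conditional (invented item):
# "If Mary's dog barks again, the neighbors will complain." presupposes
# that Mary has a dog and that it has barked before, whether or not the
# antecedent turns out to be true.
def query_model(prompt: str) -> str:
    return "entailment"  # stand-in; replace with a real model call

premise = "If Mary's dog barks again, the neighbors will complain."
hypothesis = "Mary's dog has barked before."

zero_shot = (
    f"Premise: {premise}\nHypothesis: {hypothesis}\n"
    "Answer entailment, contradiction, or neutral."
)
# Few-shot: prepend a worked example before the test item.
few_shot = (
    "Premise: If John's car breaks down again, he will sell it.\n"
    "Hypothesis: John's car has broken down before.\nAnswer: entailment\n\n"
    + zero_shot
)
print(query_model(few_shot))
```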
ChatGPT for President! Presupposed content in politicians versus GPT-generated texts
Garassino, Davide, Brocca, Nicola, Masia, Viviana
This study examines ChatGPT-4's capability to replicate linguistic strategies used in political discourse, focusing on its potential for manipulative language generation. As large language models become increasingly popular for text generation, concerns have grown regarding their role in spreading fake news and propaganda. This research compares real political speeches with those generated by ChatGPT, emphasizing presuppositions (a rhetorical device that subtly influences audiences by packaging some content as already known at the moment of utterance, thus swaying opinions without explicit argumentation). Using a corpus-based pragmatic analysis, this study assesses how well ChatGPT can mimic these persuasive strategies. The findings reveal that although ChatGPT-generated texts contain many manipulative presuppositions, key differences emerge in their frequency, form, and function compared with those of politicians. For instance, ChatGPT often relies on change-of-state verbs used in fixed phrases, whereas politicians use presupposition triggers in more varied and creative ways. Such differences, however, are challenging to detect with the naked eye, underscoring the potential risks posed by large language models in political and public discourse.
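Comparing trigger use across two corpora reduces to counting occurrences per trigger class and normalizing by corpus size. A rough sketch follows, with a tiny illustrative trigger lexicon; the real study's trigger inventory and counts are not reproduced here.

```python
from collections import Counter

# Tiny illustrative lexicon of presupposition triggers by class; a real
# analysis would use a much larger, linguistically curated inventory.
TRIGGERS = {
    "change_of_state": ["stop", "start", "continue", "return"],
    "iterative": ["again", "anymore", "still"],
    "definite": ["the"],
}

def trigger_rates(tokens: list[str]) -> dict[str, float]:
    """Relative frequency (per 1,000 tokens) of each trigger class."""
    counts = Counter()
    for tok in tokens:
        for cls, words in TRIGGERS.items():
            if tok.lower() in words:
                counts[cls] += 1
    return {cls: 1000 * n / len(tokens) for cls, n in counts.items()}

speech = "We will stop the decline and start again".split()
print(trigger_rates(speech))
```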