AITopics | contradict

contradict

Actively Identifying Causal Effects with Latent Variables Given Only Response Variable Observable

Neural Information Processing Systems, Feb-9-2026, 13:05:01 GMT

This task is challenging because the causal graph is unknown and latent confounders may even exist.

artificial intelligence, inm, machine learning, (19 more...)

Neural Information Processing Systems

Country:

Asia > China > Jiangsu Province > Nanjing (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.46)

Axiomatisation for an asynchronous epistemic logic with sending and receiving messages

Balbiani, Philippe, van Ditmarsch, Hans, Lerouvillois, Clara

arXiv.org Artificial Intelligence, Oct-6-2025

We investigate a public announcement logic for asynchronous public announcements wherein the sending of the announcements by the environment is separated from the reception of the announcements by the individual agents. Both come with different modalities. In the logical semantics, formulas are interpreted in a world of a Kripke model but given a history of prior announcements and receptions of announcements that already happened. An axiomatisation AA for such a logic has been given in prior work, for the formulas that are valid when interpreted in the Kripke model before any such announcements have taken place. This axiomatisation is a reduction system wherein one can show that every formula is equivalent to a purely epistemic formula without dynamic modalities for announcements and receptions. We propose a generalisation AA* of this axiomatisation, for the formulas that are valid when interpreted in the Kripke model given any history of prior announcements and receptions of announcements. It does not extend the axiomatisation AA, for example it is no longer valid that nobody has received any announcement. Unlike AA, this axiomatisation AA* is infinitary and it is not a reduction system.

artificial intelligence, induction hypothesis, maximal consistent theory, (15 more...)

arXiv.org Artificial Intelligence

2510.0289

Country: Europe (0.67)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Actively Identifying Causal Effects with Latent Variables Given Only Response Variable Observable

Neural Information Processing Systems, Aug-15-2025, 10:53:03 GMT

Identifying causal effects is one prominent task throughout empirical sciences.

causal effect, collider, contradict, (16 more...)

Neural Information Processing Systems

Country:

Asia > China > Jiangsu Province > Nanjing (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Existing LLMs Are Not Self-Consistent For Simple Tasks

Lin, Zhenru, Tao, Jiawen, Yuan, Yang, Yao, Andrew Chi-Chih

arXiv.org Artificial Intelligence, Jun-24-2025

Large Language Models (LLMs) have grown increasingly powerful, yet ensuring their decisions remain transparent and trustworthy requires self-consistency -- no contradictions in their internal reasoning. Our study reveals that even on simple tasks, such as comparing points on a line or a plane, or reasoning in a family tree, all smaller models are highly inconsistent, and even state-of-the-art models like DeepSeek-R1 and GPT-o4-mini are not fully self-consistent. To quantify and mitigate these inconsistencies, we introduce inconsistency metrics and propose two automated methods -- a graph-based and an energy-based approach. While these fixes provide partial improvements, they also highlight the complexity and importance of self-consistency in building more reliable and interpretable AI. The code and data are available at https://github.com/scorpio-nova/llm-self-consistency.
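
As a rough illustration of what a consistency check on pairwise comparisons might look like, the sketch below treats each model answer of the form "a is left of b" as a directed edge and reports a contradiction when the edges form a cycle. This is not the paper's inconsistency metric, nor its graph-based or energy-based method; the function name `has_contradiction` and the claim encoding are assumptions made for this example only.

```python
def has_contradiction(claims):
    """Return True if the ordering claims contain a directed cycle.

    claims: iterable of (a, b) pairs, each read as "a is left of b".
    Hypothetical helper for illustration; not taken from the paper.
    """
    # Build an adjacency map covering every point mentioned in a claim.
    graph = {}
    for a, b in claims:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())

    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GRAY
        for nxt in graph[node]:
            if color[nxt] == GRAY:
                # Back edge to a node on the current path: a cycle.
                return True
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in graph)


# Made-up answers about three points on a line.
consistent = [("A", "B"), ("B", "C"), ("A", "C")]
inconsistent = [("A", "B"), ("B", "C"), ("C", "A")]
print(has_contradiction(consistent), has_contradiction(inconsistent))
```

A cycle check only flags that some contradiction exists; it does not say how severe the inconsistency is or which answer to revise.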

large language model, machine learning, relation, (21 more...)

arXiv.org Artificial Intelligence

2506.18781

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Temporal Entailment Pretraining for Clinical Language Models over EHR Data

Tanaka, Tatsunori, Zheng, Fi, Sato, Kai, Li, Zhifeng, Zhang, Yuanyun, Li, Shi

arXiv.org Artificial Intelligence, Apr-28-2025

Clinical language models have achieved strong performance on downstream tasks by pretraining on domain specific corpora such as discharge summaries and medical notes. However, most approaches treat the electronic health record as a static document, neglecting the temporally-evolving and causally entwined nature of patient trajectories. In this paper, we introduce a novel temporal entailment pretraining objective for language models in the clinical domain. Our method formulates EHR segments as temporally ordered sentence pairs and trains the model to determine whether a later state is entailed by, contradictory to, or neutral with respect to an earlier state. Through this temporally structured pretraining task, models learn to perform latent clinical reasoning over time, improving their ability to generalize across forecasting and diagnosis tasks. We pretrain on a large corpus derived from MIMIC IV and demonstrate state of the art results on temporal clinical QA, early warning prediction, and disease progression modeling.

artificial intelligence, arxiv preprint arxiv, natural language, (16 more...)

arXiv.org Artificial Intelligence

2504.18128

Country: Asia > India (0.28)

Genre: Research Report (0.83)

Industry:

Health & Medicine > Health Care Technology > Medical Record (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.34)

Cross-Examiner: Evaluating Consistency of Large Language Model-Generated Explanations

Villa, Danielle, Chang, Maria, Murugesan, Keerthiram, Uceda-Sosa, Rosario, Ramamurthy, Karthikeyan Natesan

arXiv.org Artificial Intelligence, Mar-11-2025

Large Language Models (LLMs) are often asked to explain their outputs to enhance accuracy and transparency. However, evidence suggests that these explanations can misrepresent the models' true reasoning processes. One effective way to identify inaccuracies or omissions in these explanations is through consistency checking, which typically involves asking follow-up questions. This paper introduces cross-examiner, a new method for generating follow-up questions based on a model's explanation of an initial question. Our method combines symbolic information extraction with language model-driven question generation, resulting in better follow-up questions than those produced by LLMs alone. Additionally, this approach is more flexible than other methods and can generate a wider variety of follow-up questions.

artificial intelligence, large language model, natural language, (19 more...)

arXiv.org Artificial Intelligence

2503.08815

Country: Europe > Ireland > Leinster > County Dublin > Dublin (0.04)

Genre: Research Report (0.50)

Industry: Education (0.93)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

FactReasoner: A Probabilistic Approach to Long-Form Factuality Assessment for Large Language Models

Marinescu, Radu, Bhattacharjya, Debarun, Lee, Junkyu, Tchrakian, Tigran, Cano, Javier Carnerero, Hou, Yufang, Daly, Elizabeth, Pascale, Alessandra

arXiv.org Artificial Intelligence, Feb-25-2025

Large language models (LLMs) have demonstrated vast capabilities on generative tasks in recent years, yet they struggle with guaranteeing the factual correctness of the generated content. This makes these models unreliable in realistic situations where factually accurate responses are expected. In this paper, we propose FactReasoner, a new factuality assessor that relies on probabilistic reasoning to assess the factuality of a long-form generated response. Specifically, FactReasoner decomposes the response into atomic units, retrieves relevant contexts for them from an external knowledge source, and constructs a joint probability distribution over the atoms and contexts using probabilistic encodings of the logical relationships (entailment, contradiction) between the textual utterances corresponding to the atoms and contexts. FactReasoner then computes the posterior probability of whether atomic units in the response are supported by the retrieved contexts. Our experiments on labeled and unlabeled benchmark datasets demonstrate clearly that FactReasoner improves considerably over state-of-the-art prompt-based approaches in terms of both factual precision and recall.

atom, atomic unit, dataset, (16 more...)

arXiv.org Artificial Intelligence

2502.18573

Country:

South America > Brazil (0.28)
North America > United States > New Jersey (0.04)
North America > United States > New York (0.04)
(8 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Leisure & Entertainment (0.68)
Media (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.66)

NGQA: A Nutritional Graph Question Answering Benchmark for Personalized Health-aware Nutritional Reasoning

Zhang, Zheyuan, Li, Yiyang, Le, Nhi Ha Lan, Wang, Zehong, Ma, Tianyi, Galassi, Vincent, Murugesan, Keerthiram, Moniz, Nuno, Geyer, Werner, Chawla, Nitesh V, Zhang, Chuxu, Ye, Yanfang

arXiv.org Artificial Intelligence, Dec-19-2024

Diet plays a critical role in human health, yet tailoring dietary reasoning to individual health conditions remains a major challenge. Nutrition Question Answering (QA) has emerged as a popular method for addressing this problem. However, current research faces two critical limitations. On one hand, the absence of datasets involving user-specific medical information severely limits personalization. This challenge is further compounded by the wide variability in individual health needs. On the other hand, while large language models (LLMs), a popular solution for this task, demonstrate strong reasoning abilities, they struggle with the domain-specific complexities of personalized healthy dietary reasoning, and existing benchmarks fail to capture these challenges. To address these gaps, we introduce the Nutritional Graph Question Answering (NGQA) benchmark, the first graph question answering dataset designed for personalized nutritional health reasoning. NGQA leverages data from the National Health and Nutrition Examination Survey (NHANES) and the Food and Nutrient Database for Dietary Studies (FNDDS) to evaluate whether a food is healthy for a specific user, supported by explanations of the key contributing nutrients. The benchmark incorporates three question complexity settings and evaluates reasoning across three downstream tasks. Extensive experiments with LLM backbones and baseline models demonstrate that the NGQA benchmark effectively challenges existing models. In sum, NGQA addresses a critical real-world problem while advancing GraphQA research with a novel domain-specific benchmark.

large language model, machine learning, question answering, (21 more...)

arXiv.org Artificial Intelligence

2412.15547

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Addiction Disorder (1.00)
Health & Medicine > Therapeutic Area > Endocrinology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

PREDICT: Preference Reasoning by Evaluating Decomposed preferences Inferred from Candidate Trajectories

Aroca-Ouellette, Stephane, Mackraz, Natalie, Theobald, Barry-John, Metcalf, Katherine

arXiv.org Artificial Intelligence, Oct-8-2024

Accommodating human preferences is essential for creating AI agents that deliver personalized and effective interactions. Recent work has shown the potential for LLMs to infer preferences from user interactions, but they often produce broad and generic preferences, failing to capture the unique and individualized nature of human preferences. This paper introduces PREDICT, a method designed to enhance the precision and adaptability of inferring preferences. PREDICT incorporates three key elements: (1) iterative refinement of inferred preferences, (2) decomposition of preferences into constituent components, and (3) validation of preferences across multiple trajectories. We evaluate PREDICT on two distinct environments: a gridworld setting and a new text-domain environment (PLUME).

agent, email, user example, (17 more...)

arXiv.org Artificial Intelligence

2410.06273

Country:

North America > United States > Colorado > Boulder County > Boulder (0.14)
Europe > Germany (0.04)
North America > United States > New York (0.04)
(7 more...)

Genre: Research Report (0.84)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)

Rethinking Semantic Parsing for Large Language Models: Enhancing LLM Performance with Semantic Hints

An, Kaikai, Si, Shuzheng, Hu, Helan, Zhao, Haozhe, Wang, Yuchi, Guo, Qingyan, Chang, Baobao

arXiv.org Artificial Intelligence, Sep-22-2024

Semantic Parsing aims to capture the meaning of a sentence and convert it into a logical, structured form. Previous studies show that semantic parsing enhances the performance of smaller models (e.g., BERT) on downstream tasks. However, it remains unclear whether the improvements extend similarly to LLMs. In this paper, our empirical findings reveal that, unlike smaller models, directly adding semantic parsing results into LLMs reduces their performance. To overcome this, we propose SENSE, a novel prompting approach that embeds semantic hints within the prompt. Experiments show that SENSE consistently improves LLMs' performance across various tasks, highlighting the potential of integrating semantic information to improve LLM capabilities.

computational linguistic, entail, structure and semantic, (13 more...)

arXiv.org Artificial Intelligence

2409.14469

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Singapore (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(2 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.98)
