10 Breakthrough Technologies 2026
Our reporters and editors constantly debate which emerging technologies will define the future. Once a year, we take stock and share some educated guesses with our readers. Here are the advances that we think will drive progress or incite the most change, for better or worse, in the years ahead. Rubrik is the exclusive sponsor of the 10 Breakthrough Technologies 2026 and had no editorial influence on this list. Rubrik is a security and AI operations company that aims to secure and accelerate the world's AI transformation.
- Energy (0.99)
- Health & Medicine > Therapeutic Area (0.49)
- Health & Medicine > Pharmaceuticals & Biotechnology (0.33)
Hybrid Fact-Checking that Integrates Knowledge Graphs, Large Language Models, and Search-Based Retrieval Agents Improves Interpretable Claim Verification
Kolli, Shaghayegh, Rosenbaum, Richard, Cavelius, Timo, Strothe, Lasse, Lata, Andrii, Diesner, Jana
Large language models (LLMs) excel at generating fluent utterances but can lack reliable grounding in verified information. At the same time, knowledge-graph-based fact-checkers deliver precise and interpretable evidence, yet suffer from limited coverage or latency. By integrating LLMs with knowledge graphs and real-time search agents, we introduce a hybrid fact-checking approach that leverages the individual strengths of each component. Our system comprises three autonomous steps: 1) Knowledge Graph (KG) retrieval for rapid one-hop lookups in DBpedia, 2) an LLM-based classification guided by a task-specific labeling prompt, producing outputs with internal rule-based logic, and 3) a Web Search Agent invoked only when KG coverage is insufficient. Our pipeline achieves an F1 score of 0.93 on the Supported/Refuted split of the FEVER benchmark without task-specific fine-tuning. To address Not Enough Information (NEI) cases, we conduct a targeted reannotation study showing that our approach frequently uncovers valid evidence for claims originally labeled NEI, as confirmed by both expert annotators and LLM reviewers. With this paper, we present a modular, open-source fact-checking pipeline with fallback strategies and generalization across datasets.
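The three-step pipeline described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the toy knowledge graph stands in for DBpedia, and the classifier and search agent are trivial stubs.

```python
# Minimal sketch of the hybrid pipeline: KG lookup first, web search as
# fallback, then a label decision. All components are illustrative stand-ins.

def kg_lookup(claim, kg):
    """Step 1: one-hop lookup in a toy triple store (stand-in for DBpedia)."""
    return [t for t in kg if t[0] in claim and t[2] in claim]

def web_search(claim):
    """Step 3: fallback web-search agent, stubbed out for this sketch."""
    return []  # returns no extra evidence in this toy example

def classify(claim, evidence):
    """Step 2: stand-in for the LLM labeling prompt; here a trivial rule."""
    return "SUPPORTED" if evidence else "NOT ENOUGH INFO"

def verify(claim, kg):
    evidence = kg_lookup(claim, kg)
    if not evidence:                  # KG coverage insufficient
        evidence = web_search(claim)  # invoke the fallback agent
    return classify(claim, evidence)

kg = [("Paris", "capitalOf", "France")]
print(verify("Paris is the capital of France", kg))  # SUPPORTED
print(verify("Berlin is the capital of Spain", kg))  # NOT ENOUGH INFO
```

The key design point is the ordering: the cheap, interpretable KG lookup runs first, and the slower search agent is invoked only on a coverage miss.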
- North America > Canada (0.05)
- North America > Mexico > Mexico City > Mexico City (0.04)
- Asia > Singapore (0.04)
- (5 more...)
Federated Data Analytics for Cancer Immunotherapy: A Privacy-Preserving Collaborative Platform for Patient Management
Raheem, Mira, Papazoglou, Michael, Krämer, Bernd, El-Tazi, Neamat, Elgammal, Amal
Connected health is a multidisciplinary approach focused on health management, prioritizing patient needs in the creation of tools, services, and treatments. This paradigm ensures proactive and efficient care by facilitating the timely exchange of accurate patient information among all stakeholders in the care continuum. The rise of digital technologies and process innovations promises to enhance connected health by integrating various healthcare data sources. This integration aims to personalize care, predict health outcomes, and streamline patient management, though challenges remain, particularly in data architecture, application interoperability, and security. Data analytics can provide critical insights for informed decision-making and health co-creation, but solutions must prioritize end-users, including patients and healthcare professionals. This perspective was explored through an agile System Development Lifecycle in an EU-funded project aimed at developing an integrated AI-enabled solution for managing cancer patients undergoing immunotherapy. This paper contributes a collaborative digital framework integrating stakeholders across the care continuum, leveraging federated big data analytics and artificial intelligence for improved decision-making while ensuring privacy. Analytical capabilities, such as treatment recommendations and adverse event predictions, were validated using real-life data, achieving 70%-90% accuracy in a pilot study with the medical partners, demonstrating the framework's effectiveness.
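The federated idea at the core of the framework can be illustrated with a toy aggregate: each site computes a summary over its private records, and only those summaries, never the raw records, leave the site. This is a generic sketch of federated aggregation, not the project's implementation.

```python
# Illustrative federated aggregation: hospitals share only (mean, count)
# pairs; the coordinator reconstructs the global mean without raw data.

def local_mean(records):
    """Each site computes a summary statistic over its private records."""
    return sum(records) / len(records), len(records)

def federated_mean(site_summaries):
    """The coordinator combines per-site (mean, count) pairs only."""
    total = sum(mean * n for mean, n in site_summaries)
    count = sum(n for _, n in site_summaries)
    return total / count

site_a = local_mean([1.0, 2.0, 3.0])   # private to hospital A
site_b = local_mean([5.0, 7.0])        # private to hospital B
print(federated_mean([site_a, site_b]))  # 3.6, the mean of all five values
```

Real federated analytics adds secure aggregation and differential-privacy noise on top of this exchange pattern; the data-minimization principle is the same.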
- Africa > Middle East > Egypt > Cairo Governorate > Cairo (0.05)
- Europe > Spain (0.04)
- Europe > Germany > Brandenburg > Potsdam (0.04)
- (8 more...)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Consumer Health (1.00)
- Information Technology > Data Science > Data Mining > Big Data (1.00)
- Information Technology > Communications > Web > Semantic Web (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
If We May De-Presuppose: Robustly Verifying Claims through Presupposition-Free Question Decomposition
Dipta, Shubhashis Roy, Ferraro, Francis
Prior work has shown that presupposition in generated questions can introduce unverified assumptions, leading to inconsistencies in claim verification. Additionally, prompt sensitivity remains a significant challenge for large language models (LLMs), resulting in performance variance as high as 3-6%. While recent advancements have reduced this gap, our study demonstrates that prompt sensitivity remains a persistent issue. To address this, we propose a structured and robust claim verification framework that reasons through presupposition-free, decomposed questions. Extensive experiments across multiple prompts, datasets, and LLMs reveal that even state-of-the-art models remain susceptible to prompt variance and presupposition. Our method consistently mitigates these issues, achieving a 2-5% improvement.
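The contrast the abstract draws can be made concrete: a question like "Why did X win the award?" presupposes that X won, whereas a presupposition-free decomposition asks only neutral yes/no questions about atomic facts. The sketch below illustrates that idea with made-up templates and a toy oracle; it is not the paper's framework.

```python
# Toy presupposition-free decomposition: each sub-question verifies exactly
# one atomic fact and assumes nothing beyond it.

def decompose(claim_facts):
    """Turn atomic (subject, relation, object) facts into neutral yes/no questions."""
    return [f"Is it true that {s} {r} {o}?" for s, r, o in claim_facts]

def verify_claim(claim_facts, answer_fn):
    """A claim is supported only if every decomposed question checks out."""
    return all(answer_fn(q) for q in decompose(claim_facts))

facts = [("Marie Curie", "won", "the 1903 Nobel Prize in Physics"),
         ("Marie Curie", "was born in", "Warsaw")]
attested = set(decompose(facts))  # toy oracle: every sub-question is attested
print(verify_claim(facts, lambda q: q in attested))  # True
```

Because no sub-question embeds an unverified premise, a wrong fact fails its own question rather than silently propagating through later ones.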
- North America > United States > Maryland > Baltimore County (0.14)
- North America > United States > Maryland > Baltimore (0.14)
- Europe > Germany (0.04)
- (12 more...)
- Media (0.68)
- Leisure & Entertainment (0.68)
- Government (0.68)
VeriTrail: Closed-Domain Hallucination Detection with Traceability
Metropolitansky, Dasha, Larson, Jonathan
Even when instructed to adhere to source material, Language Models often generate unsubstantiated content - a phenomenon known as "closed-domain hallucination." This risk is amplified in processes with multiple generative steps (MGS), compared to processes with a single generative step (SGS). However, due to the greater complexity of MGS processes, we argue that detecting hallucinations in their final outputs is necessary but not sufficient: it is equally important to trace where hallucinated content was likely introduced and how faithful content may have been derived from the source through intermediate outputs. To address this need, we present VeriTrail, the first closed-domain hallucination detection method designed to provide traceability for both MGS and SGS processes. We also introduce the first datasets to include all intermediate outputs as well as human annotations of final outputs' faithfulness for their respective MGS processes. We demonstrate that VeriTrail outperforms baseline methods on both datasets.
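The traceability idea can be sketched in a few lines: for each claim in the final output, walk back through the intermediate outputs of a multi-generative-step process and record where it first appears; a claim that never traces back to the source is flagged. This is a hedged illustration of the concept using naive substring matching, not VeriTrail's method.

```python
# Toy provenance walk over an MGS process: stages run from source to final.

def trace(claim, stages):
    """Return the name of the earliest stage containing the claim, else None."""
    for name, text in stages:
        if claim.lower() in text.lower():
            return name
    return None  # unsubstantiated: a likely closed-domain hallucination

stages = [
    ("source", "The bridge opened in 1937."),
    ("summary", "The bridge opened in 1937 and is famous."),
]
print(trace("opened in 1937", stages))  # source
print(trace("painted gold", stages))    # None -> flagged
```

A real system would match claims semantically rather than by substring, but the output shape is the point: each claim carries a stage label showing where it entered the pipeline.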
- Law (0.67)
- Energy > Renewable (0.46)
- Government > Regional Government (0.46)
- Information Technology > Services (0.45)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
All You Need is Sally-Anne: ToM in AI Strongly Supported After Surpassing Tests for 3-Year-Olds
Alon, Nitay, Barnby, Joseph, Mirsky, Reuth, Sarkadi, Stefan
Theory of Mind (ToM) is a hallmark of human cognition, allowing individuals to reason about others' beliefs and intentions. Engineers behind recent advances in Artificial Intelligence (AI) have claimed to demonstrate comparable capabilities. This paper presents a model that surpasses traditional ToM tests designed for 3-year-old children, providing strong support for the presence of ToM in AI systems.
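The logic of the Sally-Anne test the title alludes to is simple to state in code: the subject must predict where Sally will look, which is where she believes the marble is, not where it actually is. The event names below are illustrative.

```python
# The classic Sally-Anne false-belief scenario as a state-tracking exercise.

def sally_anne(events):
    """Track the true marble location and Sally's belief through events."""
    location = belief = None
    sally_present = True
    for event in events:
        if event == "sally_leaves":
            sally_present = False
        elif event.startswith("marble_to_"):
            location = event[len("marble_to_"):]
            if sally_present:
                belief = location  # Sally only updates beliefs she witnesses
    return {"actual": location, "sally_looks_in": belief}

result = sally_anne(["marble_to_basket", "sally_leaves", "marble_to_box"])
print(result)  # {'actual': 'box', 'sally_looks_in': 'basket'}
```

Passing the test means answering "basket" (Sally's false belief) rather than "box" (the true state), which is why the test probes belief attribution rather than world knowledge.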
- Oceania > Australia > Western Australia (0.05)
- Europe > United Kingdom > England > Greater London > London (0.05)
- North America > United States (0.04)
- (2 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.47)
MEMERAG: A Multilingual End-to-End Meta-Evaluation Benchmark for Retrieval Augmented Generation
Blandón, María Andrea Cruz, Talur, Jayasimha, Charron, Bruno, Liu, Dong, Mansour, Saab, Federico, Marcello
Automatic evaluation of retrieval augmented generation (RAG) systems relies on fine-grained dimensions like faithfulness and relevance, as judged by expert human annotators. Meta-evaluation benchmarks support the development of automatic evaluators that correlate well with human judgement. However, existing benchmarks predominantly focus on English or use translated data, which fails to capture cultural nuances. A native approach provides a better representation of the end user experience. In this work, we develop a Multilingual End-to-end Meta-Evaluation RAG benchmark (MEMERAG). Our benchmark builds on the popular MIRACL dataset, using native-language questions and generating responses with diverse large language models (LLMs), which are then assessed by expert annotators for faithfulness and relevance. We describe our annotation process and show that it achieves high inter-annotator agreement. We then analyse the performance of the answer-generating LLMs across languages according to the human evaluators. Finally, we apply the dataset to our main use case: benchmarking multilingual automatic evaluators (LLM-as-a-judge). We show that our benchmark can reliably identify improvements offered by advanced prompting techniques and LLMs. We will release our benchmark to support the community developing accurate evaluation methods for multilingual RAG systems.
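The meta-evaluation step reduces to scoring an automatic evaluator against expert labels. The sketch below shows the simplest such agreement metric with made-up labels; benchmarks like this typically also report chance-corrected statistics such as Cohen's kappa.

```python
# Scoring an automatic judge against human faithfulness labels.

def agreement(judge_labels, human_labels):
    """Fraction of items where the automatic judge matches the human label."""
    matches = sum(j == h for j, h in zip(judge_labels, human_labels))
    return matches / len(human_labels)

human = ["faithful", "unfaithful", "faithful", "faithful"]
judge = ["faithful", "unfaithful", "unfaithful", "faithful"]  # stub LLM judge
print(agreement(judge, human))  # 0.75
```

An automatic evaluator is useful for benchmarking only insofar as this agreement (or a correlation analogue) stays high across all of the benchmark's languages, which is precisely what a multilingual meta-evaluation set lets one measure.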
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > Spain (0.14)
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- (7 more...)
STRIVE: Structured Reasoning for Self-Improvement in Claim Verification
Gong, Haisong, Li, Jing, Wu, Junfei, Liu, Qiang, Wu, Shu, Wang, Liang
Claim verification is the task of determining whether a claim is supported or refuted by evidence. Self-improvement methods, where reasoning chains are generated and those leading to correct results are selected for training, have succeeded in tasks like mathematical problem solving. However, in claim verification, this approach struggles. Low-quality reasoning chains may falsely match binary truth labels, introducing faulty reasoning into the self-improvement process and ultimately degrading performance. To address this, we propose STRIVE: Structured Reasoning for Self-Improved Verification. Our method introduces a structured reasoning design with Claim Decomposition, Entity Analysis, and Evidence Grounding Verification. These components improve reasoning quality, reduce errors, and provide additional supervision signals for self-improvement. STRIVE begins with a warm-up phase, where the base model is fine-tuned on a small number of annotated examples to learn the structured reasoning design. It is then applied to generate reasoning chains for all training examples, selecting only those that are correct and structurally sound for subsequent self-improvement training. We demonstrate that STRIVE achieves significant improvements over baseline models, with a 31.4% performance gain over the base model and 20.7% over Chain of Thought on the HOVER datasets, highlighting its effectiveness.
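The selection step that distinguishes STRIVE from vanilla self-improvement can be sketched as a filter: a generated chain is kept for training only if its final label is correct and the required structured components are all present, which screens out lucky guesses with faulty reasoning. Field names below are illustrative, not the paper's schema.

```python
# Filtering reasoning chains for self-improvement training: keep only chains
# that are both correct and structurally sound.

REQUIRED = {"claim_decomposition", "entity_analysis", "evidence_grounding"}

def select_for_training(chains, gold_label):
    """Drop chains that merely guess the label without the required structure."""
    kept = []
    for chain in chains:
        correct = chain["label"] == gold_label
        structured = REQUIRED <= set(chain["steps"])
        if correct and structured:
            kept.append(chain)
    return kept

chains = [
    {"label": "SUPPORTED", "steps": ["claim_decomposition", "entity_analysis",
                                     "evidence_grounding"]},
    {"label": "SUPPORTED", "steps": ["entity_analysis"]},     # lucky guess
    {"label": "REFUTED",   "steps": ["claim_decomposition", "entity_analysis",
                                     "evidence_grounding"]},  # wrong label
]
print(len(select_for_training(chains, "SUPPORTED")))  # 1
```

The structural check is what supplies the extra supervision signal the abstract mentions: with only binary labels, the middle chain above would have been kept and its faulty reasoning amplified.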
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > China > Hubei Province > Wuhan (0.05)
- North America > United States > Pennsylvania (0.04)
- (10 more...)
- Leisure & Entertainment > Sports (0.95)
- Media (0.95)
VeriFact: Verifying Facts in LLM-Generated Clinical Text with Electronic Health Records
Chung, Philip, Swaminathan, Akshay, Goodell, Alex J., Kim, Yeasul, Reincke, S. Momsen, Han, Lichy, Deverett, Ben, Sadeghi, Mohammad Amin, Ariss, Abdel-Badih, Ghanem, Marc, Seong, David, Lee, Andrew A., Coombes, Caitlin E., Bradshaw, Brad, Sufian, Mahir A., Hong, Hyo Jung, Nguyen, Teresa P., Rasouli, Mohammad R., Kamra, Komal, Burbridge, Mark A., McAvoy, James C., Saffary, Roya, Ma, Stephen P., Dash, Dev, Xie, James, Wang, Ellen Y., Schmiesing, Clifford A., Shah, Nigam, Aghaeepour, Nima
Methods to ensure factual accuracy of text generated by large language models (LLMs) in clinical medicine are lacking. VeriFact is an artificial intelligence system that combines retrieval-augmented generation and LLM-as-a-Judge to verify whether LLM-generated text is factually supported by a patient's medical history based on their electronic health record (EHR). To evaluate this system, we introduce VeriFact-BHC, a new dataset that decomposes Brief Hospital Course narratives from discharge summaries into a set of simple statements with clinician annotations for whether each statement is supported by the patient's EHR clinical notes. Whereas the highest agreement between clinicians was 88.5%, VeriFact achieves up to 92.7% agreement when compared to a denoised and adjudicated average human clinician ground truth, suggesting that VeriFact exceeds the average clinician's ability to fact-check text against a patient's medical record. VeriFact may accelerate the development of LLM-based EHR applications by removing current evaluation bottlenecks.
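The decompose-retrieve-judge flow described in the abstract can be sketched generically: split generated text into simple statements, retrieve related notes, and have a judge mark each statement as supported. In this hedged sketch the retriever and judge are naive word-overlap stand-ins for the system's embedding retrieval and LLM-as-a-Judge.

```python
# Toy decompose-retrieve-judge pipeline over synthetic clinical notes.

def split_statements(text):
    """Decompose generated text into simple period-delimited statements."""
    return [s.strip() for s in text.split(".") if s.strip()]

def retrieve(statement, notes, k=1):
    """Rank notes by naive word overlap with the statement (stand-in retriever)."""
    words = set(statement.lower().split())
    return sorted(notes, key=lambda n: -len(words & set(n.lower().split())))[:k]

def judge(statement, note):
    """Stand-in judge: supported if most statement words appear in the note."""
    words = set(statement.lower().split())
    return len(words & set(note.lower().split())) / len(words) >= 0.5

def verify_text(text, notes):
    """Map each statement to a supported/unsupported verdict."""
    return {s: any(judge(s, n) for n in retrieve(s, notes))
            for s in split_statements(text)}

notes = ["Patient admitted with pneumonia", "Treated with antibiotics"]
print(verify_text("Patient admitted with pneumonia. Patient had surgery.", notes))
```

The per-statement output format mirrors how the dataset is annotated: clinicians label each simple statement, so system and human verdicts can be compared statement by statement.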
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- (3 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.92)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)
Weak baselines and reporting biases lead to overoptimism in machine learning for fluid-related partial differential equations
One of the most promising applications of machine learning (ML) in computational physics is to accelerate the solution of partial differential equations (PDEs). The key objective of ML-based PDE solvers is to output a sufficiently accurate solution faster than standard numerical methods, which are used as a baseline comparison. We first perform a systematic review of the ML-for-PDE solving literature. Of articles that use ML to solve a fluid-related PDE and claim to outperform a standard numerical method, we determine that 79% (60/76) compare to a weak baseline. Second, we find evidence that reporting biases, especially outcome reporting bias and publication bias, are widespread. We conclude that ML-for-PDE solving research is overoptimistic: weak baselines lead to overly positive results, while reporting biases lead to underreporting of negative results. To a large extent, these issues appear to be caused by factors similar to those of past reproducibility crises: researcher degrees of freedom and a bias towards positive results. We call for bottom-up cultural changes to minimize biased reporting as well as top-down structural reforms intended to reduce perverse incentives for doing so.
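The paper's central point can be illustrated with a two-line calculation: a reported "speedup" depends entirely on the baseline, so a win over a weak (untuned, over-resolved) solver can vanish against a strong one at the same accuracy target. The runtimes below are invented for illustration.

```python
# How baseline choice flips a headline result. Numbers are made up.

def speedup(baseline_seconds, ml_seconds):
    """Ratio > 1 means the ML surrogate is faster than the baseline."""
    return baseline_seconds / ml_seconds

ml_solver = 2.0        # runtime of a hypothetical ML surrogate
weak_baseline = 20.0   # e.g. an unoptimized solver run at excess resolution
strong_baseline = 1.5  # a tuned numerical method at the same accuracy target

print(speedup(weak_baseline, ml_solver))    # 10.0 -> looks like a win
print(speedup(strong_baseline, ml_solver))  # 0.75 -> actually slower
```

This is why the review insists on comparisons against efficient, well-tuned numerical methods at matched accuracy: both rows describe the same ML solver, and only the baseline changed.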
- Health & Medicine (1.00)
- Energy > Oil & Gas > Upstream (1.00)
- Government > Regional Government > North America Government > United States Government (0.45)