AITopics | Explanation & Argumentation

Collaborating Authors

Explanation & Argumentation

News Overviews Instructional Materials AI-Alerts Classics

A Systematic Review of User-Centred Evaluation of Explainable AI in Healthcare

Donoso-Guzmán, Ivania, Kacafírková, Kristýna Sirka, Szymanski, Maxwell, Jacobs, An, Parra, Denis, Verbert, Katrien

arXiv.org Artificial IntelligenceJun-18-2025

Despite promising developments in Explainable Artificial Intelligence, the practical value of XAI methods remains under-explored and insufficiently validated in real-world settings. Robust and context-aware evaluation is essential, not only to produce understandable explanations but also to ensure their trustworthiness and usability for intended users, but tends to be overlooked because of no clear guidelines on how to design an evaluation with users. This study addresses this gap with two main goals: (1) to develop a framework of well-defined, atomic properties that characterise the user experience of XAI in healthcare; and (2) to provide clear, context-sensitive guidelines for defining evaluation strategies based on system characteristics. We conducted a systematic review of 82 user studies, sourced from five databases, all situated within healthcare settings and focused on evaluating AI-generated explanations. The analysis was guided by a predefined coding scheme informed by an existing evaluation framework, complemented by inductive codes developed iteratively. The review yields three key contributions: (1) a synthesis of current evaluation practices, highlighting a growing focus on human-centred approaches in healthcare XAI; (2) insights into the interrelations among explanation properties; and (3) an updated framework and a set of actionable guidelines to support interdisciplinary teams in designing and implementing effective evaluation strategies for XAI systems tailored to specific application contexts.

explanation, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2506.13904

Country:

North America > United States > New York > New York County > New York City (0.14)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(3 more...)

Genre:

Research Report > New Finding (1.00)
Questionnaire & Opinion Survey (1.00)
Overview (1.00)
(2 more...)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Nuclear Medicine (1.00)
Health & Medicine > Health Care Providers & Services (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(4 more...)

Add feedback

Evaluating Explainability: A Framework for Systematic Assessment and Reporting of Explainable AI Features

Lago, Miguel A., Zamzmi, Ghada, Eich, Brandon, Delfino, Jana G.

arXiv.org Artificial IntelligenceJun-18-2025

Purpose: Explainability features are intended to provide insight into the internal mechanisms of an AI device, but there is a lack of evaluation techniques for assessing the quality of provided explanations. We propose a framework to assess and report explainable AI features. Materials and Methods: Our evaluation framework for AI explainability is based on four criteria: 1) Consistency quantifies the variability of explanations to similar inputs, 2) Plausibility estimates how close the explanation is to the ground truth, 3) Fidelity assesses the alignment between the explanation and the model internal mechanisms, and 4) Usefulness evaluates the impact on task performance of the explanation. Finally, we developed a scorecard for AI explainability methods that serves as a complete description and evaluation to accompany this type of algorithm. Results: We describe these four criteria and give examples on how they can be evaluated. As a case study, we use Ablation CAM and Eigen CAM to illustrate the evaluation of explanation heatmaps on the detection of breast lesions on synthetic mammographies. The first three criteria are evaluated for clinically-relevant scenarios. Conclusion: Our proposed framework establishes criteria through which the quality of explanations provided by AI models can be evaluated. We intend for our framework to spark a dialogue regarding the value provided by explainability features and help improve the development and evaluation of AI-based medical devices.

explanation, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2506.13917

Country: North America > United States > Maryland > Montgomery County > Silver Spring (0.04)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Nuclear Medicine (0.95)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
(2 more...)

Add feedback

Cognitive Synergy Architecture: SEGO for Human-Centric Collaborative Robots

Oh, Jaehong

arXiv.org Artificial IntelligenceJun-17-2025

This paper presents SEGO (Semantic Graph Ontology), a cognitive mapping architecture designed to integrate geometric perception, semantic reasoning, and explanation generation into a unified framework for human-centric collaborative robotics. SEGO constructs dynamic cognitive scene graphs that represent not only the spatial configuration of the environment but also the semantic relations and ontological consistency among detected objects. The architecture seamlessly combines SLAM-based localization, deep-learning-based object detection and tracking, and ontology-driven reasoning to enable real-time, semantically coherent mapping.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2506.13149

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
(3 more...)

Add feedback

Mind the XAI Gap: A Human-Centered LLM Framework for Democratizing Explainable AI

Paraschou, Eva, Arapakis, Ioannis, Yfantidou, Sofia, Macaluso, Sebastian, Vakali, Athena

arXiv.org Artificial IntelligenceJun-17-2025

Artificial Intelligence (AI) is rapidly embedded in critical decision-making systems, however their foundational ``black-box'' models require eXplainable AI (XAI) solutions to enhance transparency, which are mostly oriented to experts, making no sense to non-experts. Alarming evidence about AI's unprecedented human values risks brings forward the imperative need for transparent human-centered XAI solutions. In this work, we introduce a domain-, model-, explanation-agnostic, generalizable and reproducible framework that ensures both transparency and human-centered explanations tailored to the needs of both experts and non-experts. The framework leverages Large Language Models (LLMs) and employs in-context learning to convey domain- and explainability-relevant contextual knowledge into LLMs. Through its structured prompt and system setting, our framework encapsulates in one response explanations understandable by non-experts and technical information to experts, all grounded in domain and explainability principles. To demonstrate the effectiveness of our framework, we establish a ground-truth contextual ``thesaurus'' through a rigorous benchmarking with over 40 data, model, and XAI combinations for an explainable clustering analysis of a well-being scenario. Through a comprehensive quality and human-friendliness evaluation of our framework's explanations, we prove high content quality through strong correlations with ground-truth explanations (Spearman rank correlation=0.92) and improved interpretability and human-friendliness to non-experts through a user study (N=56). Our overall evaluation confirms trust in LLMs as HCXAI enablers, as our framework bridges the above Gaps by delivering (i) high-quality technical explanations aligned with foundational XAI methods and (ii) clear, efficient, and interpretable human-centered explanations for non-experts.

explanation, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2506.1224

Country:

Europe (1.00)
North America > United States (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Information Technology (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
(2 more...)

Add feedback

Aspect-Based Opinion Summarization with Argumentation Schemes

Zhou, Wendi, Saadat-Yazdi, Ameer, Kokciyan, Nadin

arXiv.org Artificial IntelligenceJun-13-2025

Reviews are valuable resources for customers making purchase decisions in online shopping. However, it is impractical for customers to go over the vast number of reviews and manually conclude the prominent opinions, which prompts the need for automated opinion summarization systems. Previous approaches, either extractive or abstractive, face challenges in automatically producing grounded aspect-centric summaries. In this paper, we propose a novel summarization system that not only captures predominant opinions from an aspect perspective with supporting evidence, but also adapts to varying domains without relying on a pre-defined set of aspects. Our proposed framework, ASESUM, summarizes viewpoints relevant to the critical aspects of a product by extracting aspect-centric arguments and measuring their salience and validity. We conduct experiments on a real-world dataset to demonstrate the superiority of our approach in capturing diverse perspectives of the original reviews compared to new and existing methods.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2506.09917

Country:

Asia (0.46)
Europe > Spain (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.42)

Add feedback

Interview with Amar Halilovic: Explainable AI for robotics

AIHubJun-10-2025, 07:50:15 GMT

In this interview series, we're meeting some of the AAAI/SIGAI Doctoral Consortium participants to find out more about their research. The Doctoral Consortium provides an opportunity for a group of PhD students to discuss and explore their research interests and career objectives in an interdisciplinary workshop together with a panel of established researchers. In this latest interview, we hear from Amar Halilovic, a PhD student at Ulm University. I'm currently a PhD student at Ulm University in Germany, where I focus on explainable AI for robotics. My research investigates how robots can generate explanations of their actions in a way that aligns with human preferences and expectations, particularly in navigation tasks.

artificial intelligence, explanation, natural language, (6 more...)

AIHub

Country: Europe > Germany (0.27)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.63)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.63)

Add feedback

VARSHAP: Addressing Global Dependency Problems in Explainable AI with Variance-Based Local Feature Attribution

Gajewski, Mateusz, Morzy, Mikołaj, Karczmarz, Adam, Sankowski, Piotr

arXiv.org Artificial IntelligenceJun-10-2025

Existing feature attribution methods like SHAP often suffer from global dependence, failing to capture true local model behavior. This paper introduces VARSHAP, a novel model-agnostic local feature attribution method which uses the reduction of prediction variance as the key importance metric of features. Building upon Shapley value framework, VARSHAP satisfies the key Shapley axioms, but, unlike SHAP, is resilient to global data distribution shifts. Experiments on synthetic and real-world datasets demonstrate that VARSHAP outperforms popular methods such as KernelSHAP or LIME, both quantitatively and qualitatively.

attribution, data mining, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2506.07229

Country: Europe > Poland (0.29)

Genre: Research Report (1.00)

Industry:

Government (0.46)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Data Science > Data Mining (0.68)
(3 more...)

Add feedback

ExplainBench: A Benchmark Framework for Local Model Explanations in Fairness-Critical Applications

Afful, James

arXiv.org Artificial IntelligenceJun-10-2025

As machine learning systems are increasingly deployed in high-stakes domains such as criminal justice, finance, and healthcare, the demand for interpretable and trustworthy models has intensified. Despite the proliferation of local explanation techniques, including SHAP, LIME, and counterfactual methods, there exists no standardized, reproducible framework for their comparative evaluation, particularly in fairness-sensitive settings. We introduce ExplainBench, an open-source benchmarking suite for systematic evaluation of local model explanations across ethically consequential datasets. ExplainBench provides unified wrappers for popular explanation algorithms, integrates end-to-end pipelines for model training and explanation generation, and supports evaluation via fidelity, sparsity, and robustness metrics. The framework includes a Streamlit-based graphical interface for interactive exploration and is packaged as a Python module for seamless integration into research workflows. We demonstrate ExplainBench on datasets commonly used in fairness research, such as COMPAS, UCI Adult Income, and LendingClub, and showcase how different explanation methods behave under a shared experimental protocol. By enabling reproducible, comparative analysis of local explanations, ExplainBench advances the methodological foundations of interpretable machine learning and facilitates accountability in real-world AI systems.

explanation, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2506.0633

Country: North America > United States (0.28)

Genre: Research Report (0.82)

Industry:

Law (0.49)
Health & Medicine (0.49)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback

Stable Vision Concept Transformers for Medical Diagnosis

Hu, Lijie, Lai, Songning, Hua, Yuan, Yang, Shu, Zhang, Jingfeng, Wang, Di

arXiv.org Artificial IntelligenceJun-6-2025

Transparency is a paramount concern in the medical field, prompting researchers to delve into the realm of explainable AI (XAI). Among these XAI methods, Concept Bottleneck Models (CBMs) aim to restrict the model's latent space to human-understandable high-level concepts by generating a conceptual layer for extracting conceptual features, which has drawn much attention recently. However, existing methods rely solely on concept features to determine the model's predictions, which overlook the intrinsic feature embeddings within medical images. To address this utility gap between the original models and concept-based models, we propose Vision Concept Transformer (VCT). Furthermore, despite their benefits, CBMs have been found to negatively impact model performance and fail to provide stable explanations when faced with input perturbations, which limits their application in the medical field. To address this faithfulness issue, this paper further proposes the Stable Vision Concept Transformer (SVCT) based on VCT, which leverages the vision transformer (ViT) as its backbone and incorporates a conceptual layer. SVCT employs conceptual features to enhance decision-making capabilities by fusing them with image features and ensures model faithfulness through the integration of Denoised Diffusion Smoothing. Comprehensive experiments on four medical datasets demonstrate that our VCT and SVCT maintain accuracy while remaining interpretable compared to baselines. Furthermore, even when subjected to perturbations, our SVCT model consistently provides faithful explanations, thus meeting the needs of the medical field.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2506.05286

Country: Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)

Genre: Research Report > Promising Solution (0.48)

Industry:

Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(4 more...)

Add feedback

Causal Explanations Over Time: Articulated Reasoning for Interactive Environments

Rödling, Sebastian, Zečević, Matej, Dhami, Devendra Singh, Kersting, Kristian

arXiv.org Artificial IntelligenceJun-5-2025

Structural Causal Explanations (SCEs) can be used to automatically generate explanations in natural language to questions about given data that are grounded in a (possibly learned) causal model. Unfortunately they work for small data only. In turn they are not attractive to offer reasons for events, e.g., tracking causal changes over multiple time steps, or a behavioral component that involves feedback loops through actions of an agent. To this end, we generalize SCEs to a (recursive) formulation of explanation trees to capture the temporal interactions between reasons. We show the benefits of this more general SCE algorithm on synthetic time-series data and a 2D grid game, and further compare it to the base SCE and other existing methods for causal explanations.

explanation, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2506.03915

Country: Europe (0.93)

Genre: Research Report (1.00)

Industry: Health & Medicine > Consumer Health (0.67)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
(2 more...)

Add feedback