 Explanation & Argumentation


Explainable AI: XAI-Guided Context-Aware Data Augmentation

arXiv.org Artificial Intelligence

Melkamu Abay Mersha (a), Mesay Gemeda Yigezu (b), Atnafu Lambebo Tonja (c), Hassan Shakil (a), Samer Iskander (a), Olga Kolesnikova (b), Jugal Kalita (a)

(a) College of Engineering and Applied Science, University of Colorado Colorado Springs, Colorado Springs, 80918, CO, USA
(b) Instituto Politécnico Nacional (IPN), Centro de Investigación en Computación (CIC), 07738, Mexico City, Mexico
(c) Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), Abu Dhabi, UAE

Abstract

Explainable AI (XAI) has emerged as a powerful tool for improving the performance of AI models, going beyond providing model transparency and interpretability. The scarcity of labeled data remains a fundamental challenge in developing robust and generalizable AI models, particularly for low-resource languages. Conventional data augmentation techniques introduce noise, cause semantic drift, disrupt contextual coherence, lack control, and lead to overfitting. To address these challenges, we propose XAI-Guided Context-Aware Data Augmentation. This novel framework leverages XAI techniques to modify less critical features while selectively preserving most task-relevant features. Our approach integrates an iterative feedback loop, which refines augmented data over multiple augmentation cycles based on explainability-driven insights and the model performance gain. Our experimental results demonstrate that XAI-SR-BT and XAI-PR-BT improve the accuracy of models on hate speech and sentiment analysis tasks by 6.6% and 8.1%, respectively, compared to the baseline, using the Amharic dataset with the XLM-R model. XAI-SR-BT and XAI-PR-BT outperform existing augmentation techniques by 4.8% and 5%, respectively, on the same dataset and model. Overall, XAI-SR-BT and XAI-PR-BT consistently outperform both baseline and conventional augmentation techniques across all tasks and models. This study provides a more controlled, interpretable, and context-aware solution to data augmentation, addressing critical limitations of existing augmentation techniques and offering a new paradigm shift for leveraging XAI techniques to enhance AI model training.

Introduction

The rapid advancement of large language models (LLMs), such as GPT [1] and BERT [2], has transformed various domains, including safety-critical applications. Despite their impressive capabilities, these models operate as black boxes, raising concerns about transparency, trustworthiness, and interpretability. Explainable Artificial Intelligence (XAI) has emerged as a key solution to these concerns, offering insights into the decision-making processes of AI models.
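The core idea in the abstract, modifying less critical tokens while preserving the most task-relevant ones, can be illustrated with a minimal sketch. This is not the authors' XAI-SR-BT or XAI-PR-BT pipeline: the importance scores, the synonym table, the `keep_ratio` parameter, and the function name `augment_low_importance` are all hypothetical stand-ins for whatever an XAI attribution method and a replacement strategy would supply.

```python
def augment_low_importance(tokens, importance, synonyms, keep_ratio=0.5):
    """Replace only the tokens that an explainer ranks as less important.

    tokens      -- list of words in the sentence
    importance  -- one attribution score per token (assumed given by an XAI method)
    synonyms    -- hypothetical lookup table used for replacement
    keep_ratio  -- fraction of highest-scoring tokens that must stay unchanged
    """
    n_keep = max(1, round(len(tokens) * keep_ratio))
    # Indices sorted from most to least important.
    ranked = sorted(range(len(tokens)), key=lambda i: importance[i], reverse=True)
    protected = set(ranked[:n_keep])
    # Protected tokens are copied as-is; the rest are swapped when a synonym exists.
    return [
        tok if i in protected else synonyms.get(tok, tok)
        for i, tok in enumerate(tokens)
    ]


tokens = ["the", "film", "was", "truly", "awful"]
scores = [0.05, 0.30, 0.05, 0.20, 0.90]  # illustrative values, not real attributions
table = {"film": "movie", "truly": "really"}
print(augment_low_importance(tokens, scores, table, keep_ratio=0.4))
# ['the', 'film', 'was', 'really', 'awful']
```

In this toy run the two highest-scoring tokens ("awful" and "film") are protected, so only "truly" is replaced even though "film" also has a synonym available.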


Iterative Self-Improvement of Vision Language Models for Image Scoring and Self-Explanation

arXiv.org Artificial Intelligence

ABSTRACT

Image scoring is a crucial task in numerous real-world applications. To trust a model's judgment, understanding its rationale is essential. This paper proposes a novel training method for Vision Language Models (VLMs) to generate not only image scores but also corresponding justifications in natural language. Leveraging only an image scoring dataset and an instruction-tuned VLM, our method enables self-training, utilizing the VLM's generated text without relying on external data or models. In addition, we introduce a simple method for creating a dataset designed to improve alignment between predicted scores and their textual justifications. By iteratively training the model with Direct Preference Optimization on two distinct datasets and merging them, we can improve both scoring accuracy and the coherence of generated explanations.

Index Terms: Vision language model, Explainable AI, Image scoring, Self-training, Direct Preference Optimization

1. INTRODUCTION

Deep learning is revolutionizing image analysis, enabling automated classification and scoring with enhanced accuracy and efficiency. Examples include disease detection in medical images, defect identification in quality control, and predicting advertising effectiveness.
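For readers unfamiliar with Direct Preference Optimization, the per-pair objective can be sketched as below. This is the standard DPO loss for a single (preferred, rejected) response pair, not the paper's training code; the function name, argument names, and the default `beta` are illustrative assumptions, and the log-probabilities are assumed to be summed over the tokens of each response.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one preference pair.

    Each argument is the log-probability of a full response under either the
    policy being trained or the frozen reference model. beta scales how
    strongly the policy is pushed away from the reference.
    """
    # How much more the policy favors each response than the reference does.
    chosen_gain = policy_chosen_logp - ref_chosen_logp
    rejected_gain = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_gain - rejected_gain)
    # -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# With identical policy and reference the margin is zero and the loss is log(2).
print(dpo_loss(-5.0, -5.0, -5.0, -5.0))
```

The loss drops below log(2) once the policy raises the preferred response's likelihood relative to the rejected one, which is the signal an iterative self-training loop of this kind would rely on.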


Choices and their Provenance: Explaining Stable Solutions of Abstract Argumentation Frameworks

arXiv.org Artificial Intelligence

The rule $\mathrm{Defeated}(x) \leftarrow \mathrm{Attacks}(y,x),\, \neg \, \mathrm{Defeated}(y)$, evaluated under the well-founded semantics (WFS), yields a unique 3-valued (skeptical) solution of an abstract argumentation framework (AF). An argument $x$ is defeated ($\mathrm{OUT}$) if there exists an undefeated argument $y$ that attacks it. For 2-valued (stable) solutions, this is the case iff $y$ is accepted ($\mathrm{IN}$), i.e., if all of $y$'s attackers are defeated. Under WFS, arguments that are neither accepted nor defeated are undecided ($\mathrm{UNDEC}$). As shown in prior work, well-founded solutions (a.k.a. grounded labelings) "explain themselves": The provenance of arguments is given by subgraphs (definable via regular path queries) rooted at the node of interest. This provenance is closely related to winning strategies of a two-player argumentation game. We present a novel approach for extending this provenance to stable AF solutions. Unlike grounded solutions, which can be constructed via a bottom-up alternating fixpoint procedure, stable models often involve non-deterministic choice as part of the search for models. Thus, the provenance of stable solutions is of a different nature, and reflects a more expressive generate & test paradigm. Our approach identifies minimal sets of critical attacks, pinpointing choices and assumptions made by a stable model. These critical attack edges provide additional insights into the provenance of an argument's status, combining well-founded derivation steps with choice steps. Our approach can be understood as a form of diagnosis that finds minimal "repairs" to an AF graph such that the well-founded solution of the repaired graph coincides with the desired stable model of the original AF graph.
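The labelings described above can be made concrete with a small sketch. It computes the grounded (well-founded) labeling by the usual iterative procedure and checks whether a given set of arguments is a stable extension. It is not the authors' provenance or repair algorithm; the function names and the four-argument example graph are assumptions chosen for illustration.

```python
def grounded_labeling(arguments, attacks):
    """Label each argument IN, OUT, or UNDEC under grounded semantics.

    attacks is a set of (attacker, target) pairs.
    """
    attackers = {a: set() for a in arguments}
    for y, x in attacks:
        attackers[x].add(y)
    label = {a: "UNDEC" for a in arguments}
    changed = True
    while changed:
        changed = False
        for a in arguments:
            if label[a] != "UNDEC":
                continue
            if all(label[b] == "OUT" for b in attackers[a]):
                label[a] = "IN"      # every attacker is defeated
                changed = True
            elif any(label[b] == "IN" for b in attackers[a]):
                label[a] = "OUT"     # attacked by an accepted argument
                changed = True
    return label


def is_stable(arguments, attacks, accepted):
    """True if `accepted` is conflict-free and attacks every other argument."""
    accepted = set(accepted)
    conflict_free = not any(y in accepted and x in accepted for y, x in attacks)
    attacked = {x for y, x in attacks if y in accepted}
    return conflict_free and all(a in attacked for a in arguments if a not in accepted)


args = ["a", "b", "c", "d"]
atts = {("a", "b"), ("b", "c"), ("c", "d"), ("d", "c")}
print(grounded_labeling(args, atts))           # c and d stay UNDEC
print(is_stable(args, atts, {"a", "c"}))       # True
print(grounded_labeling(args, atts - {("d", "c")}))
```

In this example the mutual attack between c and d leaves both undecided under the grounded labeling, while {a, c} and {a, d} are both stable. Removing the single edge (d, c) makes the grounded labeling of the modified graph coincide with the stable model {a, c}, which mirrors the idea of a minimal repair via critical attack edges.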


Review for NeurIPS paper: Decisions, Counterfactual Explanations and Strategic Behavior

Neural Information Processing Systems

Weaknesses: The paper's biggest omission is that it only considers decision-maker utility as opposed to social welfare/decision subjects' utility. This is significant because the model and techniques proposed are inherently extractive in the following sense: the decision-maker can and will induce the subject to pay a cost of (say) .5 in order to improve the decision-maker's utility by .01. As noted in the paper, the hope is that the improvement is worth it to both the decision-maker and the subject, but there's no guarantee that this will actually be the case. I think the experiments should at least investigate this question: does social welfare ultimately increase? Are there individuals whose utility decreases compared to the non-strategic setting?


Review for NeurIPS paper: Decisions, Counterfactual Explanations and Strategic Behavior

Neural Information Processing Systems

This paper proposes and analyzes a model of strategic behavior under counterfactual explanations. In this model, a decision-maker chooses a policy and a small set of explanations that can be provided to decision subjects who receive unfavorable decisions. In response, decision subjects follow the given explanation to improve their future outcomes. Although choosing the optimal policy and explanations is NP-hard, the resulting formulation is shown to be submodular, allowing for efficient approximations. This paper establishes an interesting connection between strategic behavior and explainability.
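The remark that submodularity allows efficient approximation can be illustrated with the classic greedy procedure for choosing at most k items. For monotone submodular objectives under a cardinality constraint, this greedy rule is known to achieve a (1 - 1/e) approximation. The sketch below is generic and is not the paper's algorithm: the coverage-style utility, the explanation names, and `greedy_select` are hypothetical, and whether the paper's objective is monotone is not stated in the review.

```python
def greedy_select(candidates, value, k):
    """Greedily pick up to k candidates, each time adding the largest marginal gain.

    value -- set function mapping a list of chosen candidates to a number
    """
    chosen = []
    for _ in range(k):
        remaining = [c for c in candidates if c not in chosen]
        if not remaining:
            break
        base = value(chosen)
        best = max(remaining, key=lambda c: value(chosen + [c]) - base)
        if value(chosen + [best]) - base <= 0:
            break  # no candidate improves the objective any further
        chosen.append(best)
    return chosen


# Hypothetical data: each explanation helps a set of decision subjects.
helps = {
    "e1": {1, 2, 3},
    "e2": {3, 4},
    "e3": {4, 5, 6},
    "e4": {1, 6},
}


def coverage(selection):
    """Number of distinct subjects helped by the selected explanations."""
    covered = set()
    for e in selection:
        covered |= helps[e]
    return len(covered)


print(greedy_select(list(helps), coverage, k=2))
# ['e1', 'e3']
```

Coverage is a standard example of a monotone submodular function, which is why it is used here as a stand-in utility.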


Combining Abstract Argumentation and Machine Learning for Efficiently Analyzing Low-Level Process Event Streams

arXiv.org Artificial Intelligence

Monitoring and analyzing process traces is a critical task for modern companies and organizations. In scenarios where there is a gap between trace events and reference business activities, this entails an interpretation problem, amounting to translating each event of any ongoing trace into the corresponding step of the activity instance. Building on a recent approach that frames the interpretation problem as an acceptance problem within an Abstract Argumentation Framework (AAF), one can elegantly analyze plausible event interpretations (possibly in an aggregated form), as well as offer explanations for those that conflict with prior process knowledge. In settings where the event-to-activity mapping is highly uncertain (or simply under-specified), this reasoning-based approach may yield poorly informative results and heavy computation. One can therefore think of discovering a sequence-tagging model, trained to suggest highly probable candidate event interpretations in a context-aware way. However, training such a model optimally may require a large amount of manually annotated example traces. Considering the urgent need for Green AI solutions enabling environmental and societal sustainability (with reduced labor/computational costs and carbon footprint), we propose a data/computation-efficient neuro-symbolic approach to the problem, where the candidate interpretations returned by the example-driven sequence tagger are refined by the AAF-based reasoner. This allows us to also leverage prior knowledge to compensate for the scarcity of example data, as confirmed by experimental results; clearly, this property is particularly useful in settings where data annotation and model optimization costs are subject to stringent constraints.


Neuro-Symbolic Generation of Explanations for Robot Policies with Weighted Signal Temporal Logic

arXiv.org Artificial Intelligence

Neural network-based policies have demonstrated success in many robotic applications, but often lack human explainability, which poses challenges in safety-critical deployments. To address this, we propose a neuro-symbolic explanation framework that generates a weighted signal temporal logic (wSTL) specification to describe a robot policy in an interpretable form. Existing methods typically produce explanations that are verbose and inconsistent, which hinders explainability, and loose, so that they give little meaningful insight into the underlying policy. We address these issues by introducing a simplification process consisting of predicate filtering, regularization, and iterative pruning. We also introduce three novel explainability evaluation metrics -- conciseness, consistency, and strictness -- to assess explanation quality beyond conventional classification metrics. Our method is validated in three simulated robotic environments, where it outperforms baselines in generating concise, consistent, and strict wSTL explanations without sacrificing classification accuracy. This work bridges policy learning with formal methods, contributing to safer and more transparent decision-making in robotics.
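As background on what a temporal logic specification measures, the sketch below computes the quantitative robustness of two basic signal temporal logic (STL) formulas over a sampled signal. It covers plain, unweighted STL only; the weighting scheme of wSTL and the paper's learning and pruning procedure are not reproduced here. The function names and the example signal are assumptions.

```python
def always_greater(signal, threshold):
    """Robustness of 'always (x > threshold)' over a finite sampled signal.

    The value is the worst-case margin; it is positive only if every sample
    exceeds the threshold.
    """
    return min(x - threshold for x in signal)


def eventually_greater(signal, threshold):
    """Robustness of 'eventually (x > threshold)': the best-case margin."""
    return max(x - threshold for x in signal)


# Hypothetical trace, e.g. a robot's distance to an obstacle at each time step.
distance = [2.0, 3.0, 1.5]
print(always_greater(distance, 1.0))      # 0.5
print(eventually_greater(distance, 1.0))  # 2.0
```

A positive robustness value means the trace satisfies the formula, and its magnitude indicates how far the trace is from violating it.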


B-XAIC Dataset: Benchmarking Explainable AI for Graph Neural Networks Using Chemical Data

arXiv.org Artificial Intelligence

Understanding the reasoning behind deep learning model predictions is crucial in cheminformatics and drug discovery, where a molecule's design determines its properties. However, current evaluation frameworks for Explainable AI (XAI) in this domain often rely on artificial datasets or simplified tasks, employing data-derived metrics that fail to capture the complexity of real-world scenarios and lack a direct link to explanation faithfulness. To address this, we introduce B-XAIC, a novel benchmark constructed from real-world molecular data and diverse tasks with known ground-truth rationales for assigned labels. Through a comprehensive evaluation using B-XAIC, we reveal limitations of existing XAI methods for Graph Neural Networks (GNNs) in the molecular domain. This benchmark provides a valuable resource for gaining deeper insights into the faithfulness of XAI, facilitating the development of more reliable and interpretable models.


BACON: A fully explainable AI model with graded logic for decision making problems

arXiv.org Artificial Intelligence

As machine learning models and autonomous agents are increasingly deployed in high-stakes, real-world domains such as healthcare, security, finance, and robotics, the need for transparent and trustworthy explanations has become critical. To ensure end-to-end transparency of AI decisions, we need models that are not only accurate but also fully explainable and human-tunable. We introduce BACON, a novel framework for automatically training explainable AI models for decision making problems using graded logic. BACON achieves high predictive accuracy while offering full structural transparency and precise, logic-based symbolic explanations, enabling effective human-AI collaboration and expert-guided refinement. We evaluate BACON with a diverse set of scenarios: classic Boolean approximation, Iris flower classification, house purchasing decisions and breast cancer diagnosis. In each case, BACON provides high-performance models while producing compact, human-verifiable decision logic. These results demonstrate BACON's potential as a practical and principled approach for delivering crisp, trustworthy explainable AI.


A Human-Centric Approach to Explainable AI for Personalized Education

arXiv.org Artificial Intelligence

Deep neural networks form the backbone of artificial intelligence research, with potential to transform the human experience in areas ranging from autonomous driving to personal assistants, healthcare to education. However, their integration into the daily routines of real-world classrooms remains limited. It is not yet common for a teacher to assign students individualized homework targeting their specific weaknesses, provide students with instant feedback, or simulate student responses to a new exam question. While these models excel in predictive performance, this lack of adoption can be attributed to a significant weakness: the lack of explainability of model decisions, leading to a lack of trust from students, parents, and teachers. This thesis aims to bring human needs to the forefront of eXplainable AI (XAI) research, grounded in the concrete use case of personalized learning and teaching. We frame the contributions along two verticals: technical advances in XAI and their aligned human studies. We investigate explainability in AI for education, revealing systematic disagreements between post-hoc explainers and identifying a need for inherently interpretable model architectures. We propose four novel technical contributions in interpretability with a multimodal modular architecture (MultiModN), an interpretable mixture-of-experts model (InterpretCC), adversarial training for explainer stability, and a theory-driven LLM-XAI framework to present explanations to students (iLLuMinaTE), which we evaluate in diverse settings with professors, teachers, learning scientists, and university students. By combining empirical evaluations of existing explainers with novel architectural designs and human studies, our work lays a foundation for human-centric AI systems that balance state-of-the-art performance with built-in transparency and trust.