Explanation & Argumentation
Effect of Confidence and Explanation on Accuracy and Trust Calibration in AI-Assisted Decision Making
Zhang, Yunfeng, Liao, Q. Vera, Bellamy, Rachel K. E.
Today, AI is being increasingly used to help human experts make decisions in high-stakes scenarios. In these scenarios, full automation is often undesirable, not only due to the significance of the outcome, but also because human experts can draw on their domain knowledge complementary to the model's to ensure task success. We refer to these scenarios as AI-assisted decision making, where the individual strengths of the human and the AI come together to optimize the joint decision outcome. A key to their success is to appropriately calibrate human trust in the AI on a case-by-case basis; knowing when to trust or distrust the AI allows the human expert to appropriately apply their knowledge, improving decision outcomes in cases where the model is likely to perform poorly. This research conducts a case study of AI-assisted decision making in which humans and AI have comparable performance alone, and explores whether features that reveal case-specific model information can calibrate trust and improve the joint performance of the human and AI. Specifically, we study the effect of showing confidence score and local explanation for a particular prediction. Through two human experiments, we show that confidence score can help calibrate people's trust in an AI model, but trust calibration alone is not sufficient to improve AI-assisted decision making, which may also depend on whether the human can bring in enough unique knowledge to complement the AI's errors. We also highlight the problems in using local explanation for AI-assisted decision making scenarios and invite the research community to explore new approaches to explainability for calibrating human trust in AI.
Google Cloud AI Explanations to increase fairness, responsibility, and trust Google Cloud Blog
June marked the first anniversary of Google's AI Principles, which formally outline our pledge to explore the potential of AI in a respectful, ethical and socially beneficial way. For Google Cloud, they also serve as an ongoing commitment to our customers--the tens of thousands of businesses worldwide who rely on Google Cloud AI every day--to deliver the transformative capabilities they need to thrive while aiming to help improve privacy, security, fairness, and the trust of their users. We strive to build AI aligned with our AI Principles and we're excited to introduce Explainable AI, which helps humans understand how a machine learning model reaches its conclusions. AI can unlock new ways to make businesses more efficient and create new opportunities to delight customers. That said, as with any new data-driven decision making tool, it can be a challenge to bring machine learning models into a business. Machine learning models can identify intricate correlations between enormous numbers of data points.
Data 2020 Outlook Part II: Explainable AI and Multi-model Databases ZDNet
In the year ahead, we see the cloud, AI, and data management as the megaforces of the data and analytics agenda. And so, picking up where Big on Data bro Andrew Brust left off last week, we're looking at some of the underlying issues that are shaping adoption. In the world of data and analytics, you can't start a conversation today without bringing in cloud and AI. Yesterday in Part I, we hit the cloud checkbox: we explored how the upcoming generation change in enterprise applications will in turn shift the context of how enterprises are going to be evaluating cloud deployment. Today we turn our attention to the core building block: what's happening in databases, and what we expect to become the sleeper issue this year in AI.
Global Big Data Conference
In Part II of our year-ahead outlook, we explore the sleeper issues that will drive data management and the mainstreaming of AI in analytics. In the year ahead, we see the cloud, AI, and data management as the megaforces of the data and analytics agenda. And so, picking up where Big on Data bro Andrew Brust left off last week, we're looking at some of the underlying issues that are shaping adoption. In the world of data and analytics, you can't start a conversation today without bringing in cloud and AI. Yesterday in Part I, we hit the cloud checkbox: we explored how the upcoming generation change in enterprise applications will in turn shift the context of how enterprises are going to be evaluating cloud deployment.
If Nothing Is Accepted -- Repairing Argumentation Frameworks
Ulbricht, Markus (Leipzig University) | Baumann, Ringo
Conflicting information in an agent's knowledge base may lead to a semantical defect, that is, a situation where it is impossible to draw any plausible conclusion. Finding out the reasons for the observed inconsistency (so-called diagnoses) and/or restoring consistency in a certain minimal way (so-called repairs) are frequently occurring issues in knowledge representation and reasoning. In this article we provide a series of first results for these problems in the context of abstract argumentation theory regarding the two most important reasoning modes, namely credulous as well as sceptical acceptance. Our analysis includes the following problems regarding minimal repairs/diagnoses: existence, verification, computation of one and enumeration of all solutions. The latter problem is tackled with a version of the so-called hitting set duality first introduced by Raymond Reiter in 1987. It turns out that grounded semantics plays an outstanding role not only in terms of complexity, but also as a useful tool to reduce the search space for diagnoses regarding other semantics.
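The notions in this abstract can be illustrated with a minimal Python sketch. It assumes an argumentation framework given as a set of argument names plus a set of (attacker, target) pairs, computes the grounded extension by iterating the characteristic function from the empty set, and then brute-forces the smallest sets of arguments whose removal leaves a non-empty grounded extension. The function names and the choice to repair by deleting arguments are illustrative assumptions, not the authors' algorithm, which relies on hitting set duality rather than exhaustive search.

```python
from itertools import combinations


def grounded_extension(arguments, attacks):
    """Least fixed point of the characteristic function, starting from the empty set."""
    attackers = {a: {x for (x, y) in attacks if y == a} for a in arguments}
    accepted = set()
    while True:
        # An argument is defended if each of its attackers is attacked
        # by some already accepted argument.
        defended = {
            a for a in arguments
            if all(any((d, b) in attacks for d in accepted) for b in attackers[a])
        }
        if defended == accepted:
            return accepted
        accepted = defended


def smallest_repairs(arguments, attacks):
    """Hypothetical helper: all smallest sets of arguments whose removal
    makes the grounded extension non-empty (exhaustive search, exponential)."""
    arguments = set(arguments)
    for size in range(len(arguments) + 1):
        found = []
        for removed in combinations(sorted(arguments), size):
            rest = arguments - set(removed)
            rest_attacks = {(x, y) for (x, y) in attacks if x in rest and y in rest}
            if grounded_extension(rest, rest_attacks):
                found.append(set(removed))
        if found:
            return found
    return []  # no removal set yields an accepted argument


# Two mutually attacking arguments: nothing is accepted under grounded semantics.
framework = ({"a", "b"}, {("a", "b"), ("b", "a")})
print(grounded_extension(*framework))
print(smallest_repairs(*framework))
```

In the mutual-attack example the grounded extension is empty, and removing either argument alone is enough to make the remaining one acceptable. The exhaustive search is only practical for very small frameworks.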
PySS3: A Python package implementing a novel text classifier with visualization tools for Explainable AI
Burdisso, Sergio G., Errecalde, Marcelo, Montes-y-Gómez, Manuel
A recently introduced text classifier, called SS3, has obtained state-of-the-art performance on the CLEF eRisk tasks. SS3 was created to deal with risk detection over text streams and therefore not only supports incremental training and classification but can also visually explain its rationale. However, little attention has been paid to the potential use of SS3 as a general classifier. We believe this could be due to the unavailability of an open-source implementation of SS3. In this work, we introduce PySS3, a package that not only implements SS3 but also comes with visualization tools that allow researchers to deploy robust, explainable, and trustworthy machine learning models for text classification.
Measuring the Quality of Explanations: The System Causability Scale (SCS). Comparing Human and Machine Explanations
Holzinger, Andreas, Carrington, André, Müller, Heimo
Recent successes in Artificial Intelligence (AI) and Machine Learning (ML) allow problems to be solved automatically without any human intervention. Autonomous approaches can be very convenient. However, in certain domains, e.g., in the medical domain, it is necessary to enable a domain expert to understand why an algorithm came up with a certain result. Consequently, the field of Explainable AI (xAI) has rapidly gained interest worldwide in various domains, particularly in medicine. Explainable AI studies the transparency and traceability of opaque AI/ML, and there is already a huge variety of methods. For example, with layer-wise relevance propagation, the relevant parts of the inputs to, and representations in, a neural network that caused a result can be highlighted. This is a first important step to ensure that end users, e.g., medical professionals, assume responsibility for decision making with AI/ML, and it is of interest to professionals and regulators. Interactive ML adds the component of human expertise to AI/ML processes by enabling human experts to re-enact and retrace AI/ML results, e.g., to check them for plausibility. This requires new human-AI interfaces for explainable AI. In order to build effective and efficient interactive human-AI interfaces, we have to deal with the question of how to evaluate the quality of explanations given by an explainable AI system. In this paper we introduce our System Causability Scale (SCS) to measure the quality of explanations. It is based on our notion of Causability (Holzinger et al., 2019) combined with concepts adapted from a widely accepted usability scale.