MLMC: Interactive multi-label multi-classifier evaluation without confusion matrices
Doknic, Aleksandar, Möller, Torsten
Machine learning-based classifiers are commonly evaluated by metrics such as accuracy, but deeper analysis is required to understand their strengths and weaknesses. MLMC is a visual exploration tool that tackles the challenge of comparing and evaluating multi-label classifiers. It offers a scalable alternative to confusion matrices, which are commonly used for such tasks but do not scale well to large numbers of classes or labels. Additionally, MLMC allows users to view classifier performance from an instance perspective, a label perspective, and a classifier perspective. Our user study shows that the techniques implemented in MLMC enable powerful multi-label classifier evaluation while remaining user-friendly.
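The evaluation task MLMC addresses can be sketched with standard tooling; the following minimal Python example (illustrative only, using hypothetical toy data and scikit-learn's multilabel_confusion_matrix and f1_score rather than MLMC itself) shows the label perspective: one 2x2 matrix and one F1 score per label for each classifier, which is the information a single large confusion matrix struggles to convey as the label set grows.

    # Minimal sketch of label-perspective evaluation for two hypothetical
    # multi-label classifiers (toy data; not MLMC's implementation).
    import numpy as np
    from sklearn.metrics import multilabel_confusion_matrix, f1_score

    # Ground truth and predictions in binary indicator format:
    # rows are instances, columns are labels (3 labels here).
    y_true = np.array([[1, 0, 1], [0, 1, 1], [1, 1, 0]])
    predictions = {
        "classifier A": np.array([[1, 0, 0], [0, 1, 1], [1, 0, 0]]),
        "classifier B": np.array([[1, 1, 1], [0, 1, 1], [1, 1, 0]]),
    }

    for name, y_pred in predictions.items():
        # One 2x2 matrix per label, laid out as [[TN, FP], [FN, TP]].
        per_label_cm = multilabel_confusion_matrix(y_true, y_pred)
        # Per-label F1 gives the label perspective; averaging would collapse
        # it back into a single classifier-level score.
        per_label_f1 = f1_score(y_true, y_pred, average=None, zero_division=0)
        print(f"{name}: per-label F1 = {np.round(per_label_f1, 2)}")
        print(f"{name}: confusion matrix for label 0 =\n{per_label_cm[0]}")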
Information That Matters: Exploring Information Needs of People Affected by Algorithmic Decisions
Schmude, Timothée, Koesten, Laura, Möller, Torsten, Tschiatschek, Sebastian
Explanations of AI systems rarely address the information needs of people affected by algorithmic decision-making (ADM). This gap between the information conveyed and the information that matters to affected stakeholders can impede understanding and adherence to regulatory frameworks such as the AI Act. To address this gap, we present the "XAI Novice Question Bank": a catalog of affected stakeholders' information needs in two ADM use cases (employment prediction and health monitoring), covering the categories of data, system context, system usage, and system specifications. Information needs were gathered in an interview study in which participants received explanations in response to their inquiries. Participants further reported their understanding and decision confidence, showing that while confidence tended to increase after receiving explanations, participants also encountered challenges to understanding, such as being unable to tell why their understanding felt incomplete. Explanations further influenced participants' perceptions of the systems' risks and benefits, which they confirmed or revised depending on the use case. When risks were perceived as high, participants expressed particular interest in explanations of intention, such as why and to what end a system was put in place. With this work, we aim to support the inclusion of affected stakeholders in explainability by contributing an overview of the information and challenges relevant to them when deciding on the adoption of ADM systems. We close by summarizing our findings in six key implications that inform the design of future explanations for affected stakeholder audiences.
Applying Interdisciplinary Frameworks to Understand Algorithmic Decision-Making
Schmude, Timothée, Koesten, Laura, Möller, Torsten, Tschiatschek, Sebastian
Well-known examples of such "high-risk" [6] systems can be found in recidivism prediction [5], refugee resettlement [3], and public employment [19]. Many authors have pointed out that faulty or biased predictions by ADM systems can have far-reaching consequences, including discrimination [5], inaccurate predictions [4], and overreliance on automated decisions [2]. High-level guidelines are therefore meant to prevent these issues by pointing out ways to develop trustworthy and ethical AI [10, 22]. However, applying these guidelines in practice remains challenging, since the meaning and priority of ethical values shift depending on who is asked [11]. Recent work in Explainable Artificial Intelligence (XAI) thus suggests equipping individuals who are involved with an ADM system and carry responsibility--so-called "stakeholders"--with the means to assess the system themselves, i.e., enabling users, deployers, and affected individuals to independently check the system's ethical values [14]. Arguably, a pronounced understanding of the system is necessary for making such an assessment. While numerous XAI studies have examined how explaining an ADM system can increase stakeholders' understanding [20, 21], we highlight two aspects that remain an open challenge: i) the amount of resources needed to produce and test domain-specific explanations and ii) the difficulty of creating and evaluating understanding for a large variety of people. Further, it is important to note that, despite our reference to "Explainable AI," ADM is not constrained to AI and indeed spans a broader problem space. Despite the emphasis on "understanding" in XAI research, the field features only a few studies that introduce learning frameworks from other disciplines.