An Explainable AI System for the Diagnosis of High Dimensional Biomedical Data

Ultsch, Alfred, Hoffmann, Jörg, Röhnert, Maximilian, Von Bonin, Malte, Oelschlägel, Uta, Brendel, Cornelia, Thrun, Michael C.

arXiv.org Artificial Intelligence 

ABSTRACT

Typical state-of-the-art flow cytometry data samples consist of measurements of more than 100,000 cells in 10 or more features. AI systems are able to diagnose such data with almost the same accuracy as human experts. However, there is one central challenge in such systems: their decisions have far-reaching consequences for the health and life of people, and therefore the decisions of AI systems need to be understandable and justifiable by humans. In this work, we present a novel explainable AI method, called ALPODS, which is able to classify (diagnose) cases based on clusters, i.e., subpopulations, in the high-dimensional data. ALPODS is able to explain its decisions in a form that is understandable for human experts. For the identified subpopulations, fuzzy reasoning rules expressed in the typical language of domain experts are generated. A visualization method based on these rules allows human experts to understand the reasoning used by the AI system. A comparison to a selection of state-of-the-art explainable AI systems shows that ALPODS operates efficiently on known benchmark data and also on everyday routine case data.

KEYWORDS: Explainable AI, Expert System, Symbolic System, Biomedical Data

1. INTRODUCTION

State-of-the-art machine learning (ML) and artificial intelligence (AI) algorithms can effectively and efficiently diagnose (classify) high-dimensional data sets in modern medicine, e.g., multiparameter flow cytometry data [Hu et al., 2019; Zhao et al., 2020]. After a training (learning) phase on learning data, these systems perform well on data that are not part of the training data, i.e., the test data. This is called supervised learning [Murphy, 2012].
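
The train-then-test protocol of supervised learning can be illustrated with a minimal sketch. It is not ALPODS and does not reproduce any system cited above: the data are synthetic, the classifier is a simple nearest-centroid rule chosen only for brevity, and all names (`make_samples`, `fit_centroids`, `predict`, `accuracy`) as well as the 70/30 split are assumptions made for illustration.

```python
import random


def make_samples(n_per_class, n_features, rng):
    """Draw synthetic 'cells' (hypothetical data): class 0 is centred at 0.0,
    class 1 at 5.0, with unit standard deviation in every feature."""
    samples = []
    for label, centre in ((0, 0.0), (1, 5.0)):
        for _ in range(n_per_class):
            features = [rng.gauss(centre, 1.0) for _ in range(n_features)]
            samples.append((features, label))
    return samples


def fit_centroids(train):
    """Training phase: compute the mean feature vector of each class."""
    sums, counts = {}, {}
    for features, label in train:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + x for s, x in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def predict(centroids, features):
    """Assign the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(centroid):
        return sum((x - c) ** 2 for x, c in zip(features, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids, data):
    """Fraction of samples whose predicted label matches the true label."""
    hits = sum(predict(centroids, features) == label for features, label in data)
    return hits / len(data)


rng = random.Random(0)                     # fixed seed for reproducibility
data = make_samples(200, 10, rng)          # 400 samples, 10 features each
rng.shuffle(data)

split = int(0.7 * len(data))               # assumed 70/30 train/test split
train, test = data[:split], data[split:]

centroids = fit_centroids(train)           # learning uses the training data only
test_accuracy = accuracy(centroids, test)  # evaluation uses unseen test data
print(f"held-out accuracy: {test_accuracy:.2f}")
```

The point of the sketch is the separation of roles: the class centroids are estimated from `train` alone, and performance is reported only on `test`, which the classifier never saw during training.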