Efficient Algorithms for Generating Provably Near-Optimal Cluster Descriptors for Explainability

Sambaturu, Prathyush, Gupta, Aparna, Davidson, Ian, Ravi, S. S., Vullikanti, Anil, Warren, Andrew

arXiv.org Artificial Intelligence 

As AI and machine learning (ML) methods become pervasive across domains, from health to urban planning, there is an increasing need to make the results of such methods interpretable. Providing such explanations has now become a legal requirement in some countries [10]. Many researchers are investigating this topic in the supervised setting, particularly for deep learning methods (see e.g., [21, 22]). Clustering is a commonly used unsupervised ML technique (see e.g., [2, 3, 9, 27, 13, 31]). It is routinely performed on diverse kinds of datasets, sometimes after constructing network abstractions and optimizing complex objective functions (e.g., modularity [2]). This can make the resulting clusters hard to interpret, especially in post-hoc analysis. Thus, a natural question is whether it is possible to explain a given set of clusters using additional attributes that, crucially, were not used in the clustering procedure. One motivation for our work is to understand the threat levels of pathogens for which genomic sequences are available.
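To make the question concrete, one simple way to formalize it is to treat the additional attributes as tags attached to items and to ask, for each cluster, for a small set of tags that together cover the cluster's items. The sketch below is only an illustration of this idea under that assumption; it uses the standard greedy set-cover heuristic, not the algorithms developed in the paper, and the function names and toy data are hypothetical.

    # Illustrative sketch (not the paper's algorithm): greedily pick a small
    # set of tags per cluster whose union covers the cluster's items.
    from typing import Dict, List, Set

    def greedy_descriptor(cluster_items: Set[str],
                          item_tags: Dict[str, Set[str]]) -> List[str]:
        """Pick tags that together cover every item in `cluster_items`.

        Each chosen tag "covers" the items carrying it; we repeatedly take the
        tag covering the most still-uncovered items (greedy set cover).
        """
        uncovered = set(cluster_items)
        descriptor: List[str] = []
        # Candidate tags are those appearing on at least one item of the cluster.
        candidates = {t for item in cluster_items
                      for t in item_tags.get(item, set())}
        while uncovered:
            best_tag, best_gain = None, 0
            for tag in candidates:
                gain = sum(1 for item in uncovered
                           if tag in item_tags.get(item, set()))
                if gain > best_gain:
                    best_tag, best_gain = tag, gain
            if best_tag is None:   # an item with no tags cannot be covered
                break
            descriptor.append(best_tag)
            uncovered = {i for i in uncovered if best_tag not in item_tags[i]}
            candidates.discard(best_tag)
        return descriptor

    if __name__ == "__main__":
        # Toy example: pathogen genomes clustered by sequence similarity, with
        # tags (e.g., host, resistance genes) that were NOT used for clustering.
        tags = {
            "g1": {"host:human", "mcr-1"},
            "g2": {"host:human", "blaNDM"},
            "g3": {"host:swine", "mcr-1"},
            "g4": {"host:swine", "blaNDM"},
        }
        clusters = {"C1": {"g1", "g2"}, "C2": {"g3", "g4"}}
        for name, items in clusters.items():
            print(name, "->", greedy_descriptor(items, tags))

On this toy input the sketch describes cluster C1 by the single tag "host:human" and C2 by "host:swine"; the greedy rule carries the usual logarithmic approximation guarantee for set cover, which is the flavor of near-optimality the title refers to.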
