The Cluster Description Problem - Complexity Results, Formulations and Approximations

Davidson, Ian, Gourru, Antoine, Ravi, S

Neural Information Processing Systems 

Consider the situation where you are given an existing $k$-way clustering $\pi$. A challenge for explainable AI is to find a compact and distinct explanations of each cluster which in this paper is using instance-level descriptors/tags from a common dictionary. Since the descriptors/tags were not given to the clustering method, this is not a semi-supervised learning situation. We show that the \emph{feasibility} problem of just testing whether any distinct description (not the most compact) exists is generally intractable for just two clusters. This means that unless \textbf{P} \cnp, there cannot exist an efficient algorithm for the cluster description problem.