spdc
Algorithm-Agnostic Interpretations for Clustering
Scholbeck, Christian A., Funk, Henri, Casalicchio, Giuseppe
A clustering outcome for high-dimensional data is typically interpreted via post-processing, involving dimension reduction and subsequent visualization. This destroys the meaning of the data and obfuscates interpretations. We propose algorithm-agnostic interpretation methods to explain clustering outcomes in reduced dimensions while preserving the integrity of the data. The permutation feature importance for clustering represents a general framework based on shuffling feature values and measuring changes in cluster assignments through custom score functions. The individual conditional expectation for clustering indicates observation-wise changes in the cluster assignment due to changes in the data. The partial dependence for clustering evaluates average changes in cluster assignments for the entire feature space. All methods can be used with any clustering algorithm able to reassign instances through soft or hard labels. In contrast to common post-processing methods such as principal component analysis, the introduced methods maintain the original structure of the features.
Stochastic Primal Dual Coordinate Method with Non-Uniform Sampling Based on Optimality Violations
Shibagaki, Atsushi, Takeuchi, Ichiro
We study primal-dual type stochastic optimization algorithms with non-uniform sampling. Our main theoretical contribution in this paper is to present a convergence analysis of Stochastic Primal Dual Coordinate (SPDC) Method with arbitrary sampling. Based on this theoretical framework, we propose Optimality Violation-based Sampling SPDC (ovsSPDC), a non-uniform sampling method based on Optimality Violation. We also propose two efficient heuristic variants of ovsSPDC called ovsSDPC+ and ovsSDPC++. Through intensive numerical experiments, we demonstrate that the proposed method and its variants are faster than other state-of-the-art primal-dual type stochastic optimization methods.