A close-up comparison of the misclassification error distance and the adjusted Rand index for external clustering evaluation

Chacón, José E.

arXiv.org Machine Learning 

The adjusted Rand index (ARI) was the recommended choice in the seminal paper of Milligan and Cooper (1986), which examined five criteria for comparing hierarchical clustering algorithms across different hierarchy levels. Their recommendation was based on the fact that, for null-case data (i.e., a synthetic sample with randomly assigned class labels, showing no significant cluster structure), the ARI was the only index that produced a flat response curve across hierarchy levels, with mean values close to zero, thus indicating that the agreement between the randomly assigned labels and the algorithm's solution was due to chance. Another popular measure for clustering validation, not included in Milligan and Cooper's study, is the misclassification error distance (MED). Its first appearance in the literature dates back at least to Régnier (1965), where it was introduced as a distance between partitions of a finite set and called the transfer distance. It is also referred to as the partition distance (Gusfield, 2002) or the maximum matching distance (Rossi, 2015).
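To make the two measures concrete, here is a minimal Python sketch using only the standard library. It assumes the usual pair-counting formula for the ARI and the maximum-matching formulation of the MED (the proportion of points left unmatched under the best one-to-one pairing of clusters); the excerpt above does not spell out either formula, so these definitions, the function names, and the brute-force search over pairings are illustrative assumptions, suitable only for a small number of clusters.

```python
from collections import Counter
from itertools import permutations
from math import comb


def contingency(labels_a, labels_b):
    """Count how many items fall in each (cluster of A, cluster of B) pair."""
    return Counter(zip(labels_a, labels_b))


def adjusted_rand_index(labels_a, labels_b):
    """ARI via pair counting (assumed standard formula, chance-corrected)."""
    n = len(labels_a)
    table = contingency(labels_a, labels_b)
    sum_cells = sum(comb(c, 2) for c in table.values())
    sum_rows = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_cols = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:  # degenerate case, e.g. both partitions trivial
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


def misclassification_error_distance(labels_a, labels_b):
    """MED as 1 - (best matched count) / n, by brute force over pairings.

    Assumption: clusters are matched one-to-one; surplus clusters on the
    larger side stay unmatched. Cost grows factorially with cluster count.
    """
    n = len(labels_a)
    table = contingency(labels_a, labels_b)
    clusters_a = list(set(labels_a))
    clusters_b = list(set(labels_b))
    if len(clusters_a) > len(clusters_b):
        # Transpose so that clusters_a is always the smaller side.
        clusters_a, clusters_b = clusters_b, clusters_a
        table = Counter({(b, a): c for (a, b), c in table.items()})
    best = max(
        sum(table[(a, b)] for a, b in zip(clusters_a, perm))
        for perm in permutations(clusters_b, len(clusters_a))
    )
    return (n - best) / n


if __name__ == "__main__":
    truth = [0, 0, 0, 1, 1, 1]
    found = [1, 1, 0, 0, 0, 0]  # same grouping except for one point
    print(misclassification_error_distance(truth, found))
    print(adjusted_rand_index(truth, found))
```

In the small example, the best pairing of clusters leaves one of the six points unmatched, so the MED is 1/6. Both measures ignore how clusters are named: relabeling a partition changes neither value. For larger numbers of clusters, the brute-force search would normally be replaced by an assignment solver such as the Hungarian algorithm.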
