Understanding partition comparison indices based on counting object pairs

Warrens, Matthijs J., van der Hoef, Hanneke

arXiv.org Machine Learning 

For example, in unsupervised machine learning, to evaluate the performance of a clustering method, researchers typically assess agreement between a reference standard partition that purports to represent the true cluster structure of the objects (gold standard), and a trial partition produced by the method that is being evaluated (Wallace 1983; Halkidi, Batiskis and Vazirgiannis 2002; Jain 2010). High agreement between the two partitions may indicate good recovery of the true cluster structure. Agreement between partitions can be assessed with so-called external validity indices (Albatineh, Niewiadomska-Bugaj and Mihalko 2006; Brun et al. 2007; Warrens 2008a, 2008b; Pfitzner et al. 2009). External validity indices can be roughly categorized into three approaches, namely 1) counting object pairs, 2) information theory (Vinh, Epps and Bailey 2010; Lei et al. 2016), and 3) matching sets (Rezaei and Fränti 2016). Most external validity indices follow the pair-counting approach, which is based on counting pairs of objects that the two partitions place in the same cluster or in different clusters.
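To make the pair-counting idea concrete, the following is a minimal sketch that tallies the four kinds of object pairs for a reference and a trial partition, each given as a list of cluster labels. The function names `pair_counts` and `rand_index` are illustrative choices, not names from the text. The Rand index is used here only as one well-known example of a pair-counting index; the text above does not single it out.

```python
from itertools import combinations


def pair_counts(reference, trial):
    """Count object pairs by how the two partitions treat them.

    Returns a tuple (both_same, ref_only, trial_only, both_diff):
      both_same  - pairs in the same cluster in both partitions
      ref_only   - pairs together in the reference, apart in the trial
      trial_only - pairs apart in the reference, together in the trial
      both_diff  - pairs in different clusters in both partitions
    """
    if len(reference) != len(trial):
        raise ValueError("partitions must label the same objects")

    both_same = ref_only = trial_only = both_diff = 0
    for i, j in combinations(range(len(reference)), 2):
        same_ref = reference[i] == reference[j]
        same_trial = trial[i] == trial[j]
        if same_ref and same_trial:
            both_same += 1
        elif same_ref:
            ref_only += 1
        elif same_trial:
            trial_only += 1
        else:
            both_diff += 1
    return both_same, ref_only, trial_only, both_diff


def rand_index(reference, trial):
    """Proportion of object pairs on which the two partitions agree."""
    both_same, ref_only, trial_only, both_diff = pair_counts(reference, trial)
    total = both_same + ref_only + trial_only + both_diff
    if total == 0:
        raise ValueError("need at least two objects to form a pair")
    return (both_same + both_diff) / total


if __name__ == "__main__":
    reference = [0, 0, 1, 1]   # hypothetical reference standard partition
    trial = [0, 0, 1, 2]       # hypothetical trial partition
    print(pair_counts(reference, trial))
    print(rand_index(reference, trial))
```

The loop visits every pair once, so it runs in quadratic time in the number of objects. That is fine for illustration; for large data sets the same counts are usually derived from the contingency table of the two partitions.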
