icvis
iCVI-ARTMAP: Accelerating and improving clustering using adaptive resonance theory predictive mapping and incremental cluster validity indices
da Silva, Leonardo Enzo Brito, Rayapati, Nagasharath, Wunsch, Donald C. II
This paper presents an adaptive resonance theory predictive mapping (ARTMAP) model which uses incremental cluster validity indices (iCVIs) to perform unsupervised learning, namely iCVI-ARTMAP. Incorporating iCVIs to the decision-making and many-to-one mapping capabilities of ARTMAP can improve the choices of clusters to which samples are incrementally assigned. These improvements are accomplished by intelligently performing the operations of swapping sample assignments between clusters, splitting and merging clusters, and caching the values of variables when iCVI values need to be recomputed. Using recursive formulations enables iCVI-ARTMAP to considerably reduce the computational burden associated with cluster validity index (CVI)-based offline clustering. Depending on the iCVI and the data set, it can achieve running times up to two orders of magnitude shorter than when using batch CVI computations. In this work, the incremental versions of Calinski-Harabasz, WB-index, Xie-Beni, Davies-Bouldin, Pakhira-Bandyopadhyay-Maulik, and negentropy increment were integrated into fuzzy ARTMAP. Experimental results show that, with proper choice of iCVI, iCVI-ARTMAP outperformed fuzzy adaptive resonance theory (ART), dual vigilance fuzzy ART, kmeans, spectral clustering, Gaussian mixture models and hierarchical agglomerative clustering algorithms in most of the synthetic benchmark data sets. It also performed competitively on real world image benchmark data sets when clustering on projections and on latent spaces generated by a deep clustering model. Naturally, the performance of iCVI-ARTMAP is subject to the selected iCVI and its suitability to the data at hand; fortunately, it is a general model wherein other iCVIs can be easily embedded.
Incremental Cluster Validity Indices for Hard Partitions: Extensions and Comparative Study
da Silva, Leonardo Enzo Brito, Melton, Niklas M., Wunsch, Donald C. II
V alidation is one of the most important aspects of clustering, but most approaches have been batch methods. Recently, interest has grown in providing incremental alternatives. This paper extends the incremental cluster validity index (iCVI) family to include incremental versions of Calinski-Harabasz (iCH), I index and Pakhira-Bandyopadhyay-Maulik (iI and iPBM), Silhouette (iSIL), Negentropy Increment (iNI), Representative Cross Information Potential (irCIP) and Representative Cross Entropy (irH), and Conn Index (iConn Index). Additionally, the effect of under-and over-partitioning on the behavior of these six iCVIs, the Partition Separation (PS) index, as well as two other recently developed iCVIs (incremental Xie-Beni (iXB) and incremental Davies-Bouldin (iDB)) was examined through a comparative study. Experimental results using fuzzy adaptive resonance theory (ART)-based clustering methods showed that while evidence of most under-partitioning cases could be inferred from the behaviors of all these iCVIs, over-partitioning was found to be a more challenging scenario indicated only by the iConn Index. The expansion of incremental validity indices provides significant novel opportunities for assessing and interpreting the results of unsupervised learning. L. E. Brito da Silva is with the Applied Computational Intelligence Laboratory, Department of Electrical and Computer Engineering, Missouri University of Science and Technology, Rolla, MO 65409 USA, and also with the CAPES Foundation, Ministry of Education of Brazil, Bras ılia, DF 70040-020, Brazil (email: leonardoenzo@ieee.org). N. M. Melton is with the Applied Computational Intelligence Laboratory, Department of Electrical and Computer Engineering, Missouri University of Science and Technology, Rolla, MO 65409 USA (email: niklasmelton@ieee.org). D. C. Wunsch II is with the Applied Computational Intelligence Laboratory, Department of Electrical and Computer Engineering, Missouri University of Science and Technology, Rolla, MO 65409 USA (email: wunsch@ieee.org). I NTRODUCTION Cluster validation [1] is a critical topic in cluster analysis.
Online Cluster Validity Indices for Streaming Data
Moshtaghi, Masud, Bezdek, James C., Erfani, Sarah M., Leckie, Christopher, Bailey, James
Cluster analysis is used to explore structure in unlabeled data sets in a wide range of applications. An important part of cluster analysis is validating the quality of computationally obtained clusters. A large number of different internal indices have been developed for validation in the offline setting. However, this concept has not been extended to the online setting. A key challenge is to find an efficient incremental formulation of an index that can capture both cohesion and separation of the clusters over potentially infinite data streams. In this paper, we develop two online versions (with and without forgetting factors) of the Xie-Beni and Davies-Bouldin internal validity indices, and analyze their characteristics, using two streaming clustering algorithms (sk-means and online ellipsoidal clustering), and illustrate their use in monitoring evolving clusters in streaming data. We also show that incremental cluster validity indices are capable of sending a distress signal to online monitors when evolving clusters go awry. Our numerical examples indicate that the incremental Xie-Beni index with forgetting factor is superior to the other three indices tested.