Nonlinear Markov Clustering by Minimum Curvilinear Sparse Similarity
Duran, C., Acevedo, A., Ciucci, S., Muscoloni, A., Cannistraci, CV.
Developing algorithms for unsupervised pattern recognition by nonlinear clustering is a notable problem in data science. Markov clustering (MCL) is a renowned algorithm that simulates stochastic flows on a network of sample similarities to detect how the data are organized into clusters, but it has never been generalized to deal with data nonlinearity. Minimum Curvilinearity (MC) is a principle that approximates nonlinear distances between samples in the high-dimensional feature space by curvilinear distances, which are computed as transversal paths over the samples' minimum spanning tree and then stored in a kernel. Here we propose MC-MCL, the first nonlinear kernel extension of MCL, which exploits Minimum Curvilinearity to enhance the performance of MCL on real and synthetic data with underlying nonlinear patterns. MC-MCL is compared with baseline clustering methods, including DBSCAN, K-means and affinity propagation. We find that Minimum Curvilinearity provides a valuable framework for estimating nonlinear distances, including when its kernel is applied in combination with MCL. Indeed, MC-MCL outperforms classical MCL, and even the baseline clustering algorithms, on several nonlinear datasets.
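The curvilinear distance described in the abstract can be illustrated with a minimal sketch. It assumes Euclidean distances between samples, builds the minimum spanning tree with Prim's algorithm, and takes the distance between two samples to be the sum of edge weights along the unique tree path joining them. The function names (`euclidean_matrix`, `minimum_spanning_tree`, `mc_distance_matrix`) are hypothetical and chosen for illustration. The abstract does not say how the kernel is normalized or how distances are converted into the similarities that MCL consumes, so those steps are left out rather than guessed.

```python
import math


def euclidean_matrix(points):
    """Pairwise Euclidean distances (assumed base metric for this sketch)."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def minimum_spanning_tree(dist):
    """Prim's algorithm on a dense distance matrix.

    Returns an adjacency list {node: [(neighbor, weight), ...]}.
    """
    n = len(dist)
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known edge linking each node to the tree
    parent = [-1] * n
    best[0] = 0.0
    adjacency = {i: [] for i in range(n)}
    for _ in range(n):
        # Pick the node outside the tree with the cheapest connecting edge.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            weight = dist[u][parent[u]]
            adjacency[u].append((parent[u], weight))
            adjacency[parent[u]].append((u, weight))
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v] = dist[u][v]
                parent[v] = u
    return adjacency


def mc_distance_matrix(points):
    """Distance between two samples = length of their path over the MST."""
    tree = minimum_spanning_tree(euclidean_matrix(points))
    n = len(points)
    mc = [[0.0] * n for _ in range(n)]
    for source in range(n):
        # Walk the tree from each source, accumulating edge weights.
        stack = [(source, -1, 0.0)]
        while stack:
            node, previous, travelled = stack.pop()
            mc[source][node] = travelled
            for neighbor, weight in tree[node]:
                if neighbor != previous:
                    stack.append((neighbor, node, travelled + weight))
    return mc


# Five samples along an L-shaped (nonlinear) path.
samples = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
mc = mc_distance_matrix(samples)
print(mc[0][4])                          # 4.0 along the tree
print(math.dist(samples[0], samples[4]))  # about 2.83 in a straight line
```

For the two end points of the L-shaped sample set, the tree path follows the bend and has length 4.0, whereas the straight-line distance cuts across it. This is the sense in which the curvilinear distance reflects the nonlinear arrangement of the samples.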
Dec-27-2019
- Country:
- South America > Chile
- Maule Region > Talca Province > Talca (0.04)
- North America > United States
- California > San Diego County > San Diego (0.04)
- Europe
- Italy
- Sicily (0.04)
- Trentino-Alto Adige/Südtirol > Trentino Province
- Trento (0.04)
- Emilia-Romagna > Metropolitan City of Bologna
- Bologna (0.04)
- Germany > Saxony
- Dresden (0.04)
- Asia
- Middle East > Saudi Arabia (0.04)
- China > Beijing
- Beijing (0.04)
- Genre:
- Research Report > New Finding (0.46)
- Industry:
- Health & Medicine > Therapeutic Area (1.00)
- Technology: