A Strongly Consistent Sparse $k$-means Clustering with Direct $l_1$ Penalization on Variable Weights

Mar-24-2019–arXiv.org Machine Learning

We propose the Lasso Weighted $k$-means ($LW$-$k$-means) algorithm as a simple yet efficient sparse clustering procedure for high-dimensional data where the number of features ($p$) can be much larger compared to the number of observations ($n$). In the $LW$-$k$-means algorithm, we introduce a lasso-based penalty term, directly on the feature weights to incorporate feature selection in the framework of sparse clustering. $LW$-$k$-means does not make any distributional assumption of the given dataset and thus, induces a non-parametric method for feature selection. We also analytically investigate the convergence of the underlying optimization procedure in $LW$-$k$-means and establish the strong consistency of our algorithm. $LW$-$k$-means is tested on several real-life and synthetic datasets and through detailed experimental analysis, we find that the performance of the method is highly competitive against some state-of-the-art procedures for clustering and feature selection, not only in terms of clustering accuracy but also with respect to computational time.

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

Mar-24-2019

arXiv.org PDF

Add feedback

Country:
- North America > United States (0.92)

Genre:
- Research Report (0.83)

Industry:
- Health & Medicine
  - Therapeutic Area > Oncology (1.00)
  - Pharmaceuticals & Biotechnology (0.93)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found