Dimensionality Reduction for $k$-means Clustering
Large, high-dimensional datasets are now common and often necessary, yet their dimensionality makes many machine learning algorithms prone to overfitting, so it is crucial to reduce the complexity of the algorithms that operate on them. The authors of [BDM09] were among the first to address clustering in such datasets with provably accurate approximation results. They propose a simple pre-processing step for the $k$-means clustering algorithm, also known as Lloyd's method [Llo82], which is probably the most widely used clustering algorithm.
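To make the idea of a pre-processing step concrete, the following is a minimal sketch of the general pattern: reduce the dimension first, then run Lloyd's iterations on the reduced data. It does not reproduce the specific step proposed in [BDM09]; a Gaussian random projection is swapped in purely for illustration. The function names (`random_projection`, `lloyd_kmeans`, `kmeans_cost`), the synthetic data, and all parameter values are assumptions of this sketch.

```python
import numpy as np


def random_projection(X, target_dim, rng):
    """Map the rows of X to target_dim dimensions with a Gaussian random matrix.

    Illustrative stand-in for a dimensionality-reduction step; this is an
    assumption of the sketch, not the method of [BDM09].
    """
    d = X.shape[1]
    R = rng.standard_normal((d, target_dim)) / np.sqrt(target_dim)
    return X @ R


def lloyd_kmeans(X, k, rng, n_iter=50):
    """Plain Lloyd iterations: assign each point to its nearest center,
    then move each center to the mean of its assigned points."""
    # Initialise centers from k distinct data points (fancy indexing copies).
    centers = X[rng.choice(X.shape[0], size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distances between points and centers, shape (n, k).
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # leave an empty cluster's center unchanged
                centers[j] = members.mean(axis=0)
    return labels, centers


def kmeans_cost(X, labels, k):
    """Sum of squared distances from each point to the mean of its cluster."""
    cost = 0.0
    for j in range(k):
        members = X[labels == j]
        if len(members) > 0:
            cost += ((members - members.mean(axis=0)) ** 2).sum()
    return float(cost)


# Hypothetical usage: cluster in the reduced space, then evaluate the
# resulting partition in the original space.
rng = np.random.default_rng(0)
X = rng.standard_normal((300, 500))         # synthetic high-dimensional data
X_reduced = random_projection(X, 20, rng)   # 500 -> 20 dimensions
labels, _ = lloyd_kmeans(X_reduced, 3, rng)
print(kmeans_cost(X, labels, 3))
```

Evaluating the cost on the original data, using the partition found in the reduced space, is how one would check empirically whether the reduction preserved clustering quality.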
Jul-26-2020