Multi-Swap k-Means++
–Neural Information Processing Systems
The k -means algorithm of Arthur and Vassilvitskii (SODA 2007) is often the practitioners' choice algorithm for optimizing the popular k -means clustering objective and is known to give an O(\log k) -approximation in expectation. To obtain higher quality solutions, Lattanzi and Sohler (ICML 2019) proposed augmenting k -means with O(k \log \log k) local-search steps obtained through the k -means sampling distribution to yield a c -approximation to the k -means clustering problem, where c is a large absolute constant. Here we generalize and extend their local-search algorithm by considering larger and more sophisticated local-search neighborhoods hence allowing to swap multiple centers at the same time. Our algorithm achieves a 9 \varepsilon approximation ratio, which is the best possible for local search. Importantly we show that our algorithm is practical, namely easy to implement and fast enough to run on a variety of classic datasets, and outputs solutions of better cost.
Neural Information Processing Systems
Jan-18-2025, 04:41:35 GMT
- Technology: