Fast and Accurate k -means++ via Rejection Sampling

Neural Information Processing Systems 

Despite its wide adoption, k -means sometimes suffers from being slow on large data-sets so a natural question has been to obtain more efficient algorithms with similar guarantees. Interestingly our algorithm obtains the same theoretical guarantees as k -means and significantly improves earlier results on fast k -means seeding. Moreover, we show empirically that our algorithm is significantly faster than k -means and obtains solutions of equivalent quality.