Nearly Optimal Dynamic $k$-Means Clustering for High-Dimensional Data
Hu, Wei, Song, Zhao, Yang, Lin F., Zhong, Peilin
We consider the $k$-means clustering problem in the dynamic streaming setting, where points from a discrete Euclidean space $\{1, 2, \ldots, \Delta\}^d$ can be dynamically inserted to or deleted from the dataset. For this problem, we provide a one-pass coreset construction algorithm using space $\tilde{O}(k\cdot \mathrm{poly}(d, \log\Delta))$, where $k$ is the target number of centers. To our knowledge, this is the first dynamic geometric data stream algorithm for $k$-means using space polynomial in dimension and nearly optimal (linear) in $k$.
Feb-7-2019
- Country:
- North America > United States
- California > Santa Clara County > San Jose (0.04)
- Asia > Afghanistan
- Parwan Province > Charikar (0.04)
- North America > United States
- Genre:
- Research Report (0.64)
- Technology: