Scalable Second-order Riemannian Optimization for $K$-means Clustering
Xu, Peng, Hou, Chun-Ying, Chen, Xiaohui, Zhang, Richard Y.
–arXiv.org Artificial Intelligence
Clustering is a hard discrete optimization problem. Nonconvex approaches such as low-rank semidefinite programming (SDP) have recently demonstrated promising statistical and local algorithmic guarantees for cluster recovery. Due to the combinatorial structure of the $K$-means clustering problem, current relaxation algorithms struggle to balance their constraint feasibility and objective optimality, presenting tremendous challenges in computing the second-order critical points with rigorous guarantees. In this paper, we provide a new formulation of the $K$-means problem as a smooth unconstrained optimization over a submanifold and characterize its Riemannian structures to allow it to be solved using a second-order cubic-regularized Riemannian Newton algorithm. By factorizing the $K$-means manifold into a product manifold, we show how each Newton subproblem can be solved in linear time. Our numerical experiments show that the proposed method converges significantly faster than the state-of-the-art first-order nonnegative low-rank factorization method, while achieving similarly optimal statistical accuracy.
arXiv.org Artificial Intelligence
Sep-29-2025
- Country:
- Asia > Middle East
- Jordan (0.04)
- Europe
- Serbia > Vojvodina
- South Bačka District > Novi Sad (0.04)
- United Kingdom > England
- Cambridgeshire > Cambridge (0.04)
- Serbia > Vojvodina
- North America > United States
- California > San Diego County
- San Diego (0.04)
- Illinois (0.04)
- Louisiana > Orleans Parish
- New Orleans (0.04)
- New York > New York County
- New York City (0.04)
- California > San Diego County
- Asia > Middle East
- Genre:
- Research Report (0.81)
- Technology: