Improved Distributed Principal Component Analysis

Jan-17-2025, 18:43:00 GMT–Neural Information Processing Systems

We study the distributed computing setting in which multiple servers, each holding a set of points, wish to compute functions on the union of their point sets. A key task in this setting is Principal Component Analysis (PCA), in which the servers would like to compute a low-dimensional subspace capturing as much of the variance of the union of their point sets as possible. Given a procedure for approximate PCA, one can use it to approximately solve problems such as k-means clustering and low-rank approximation. The essential properties of an approximate distributed PCA algorithm are its communication cost and its computational efficiency for a given desired accuracy in downstream applications. We give new algorithms and analyses for distributed PCA that lead to improved communication and computational costs for k-means clustering and related problems.
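To make the setting concrete, here is a minimal sketch of a simple exact baseline, not the algorithms proposed in the paper. It assumes the points are already mean-centered and that a coordinator collects one summary per server. Each server sends only its d x d matrix of summed outer products, so communication per server is about d^2 numbers regardless of how many points it holds. The function names (`local_gram`, `sum_grams`, `top_direction`) and the toy data are hypothetical, chosen for illustration only.

```python
import math


def local_gram(points):
    """Server side: summarize local points (rows) as the d x d matrix P^T P."""
    d = len(points[0])
    gram = [[0.0] * d for _ in range(d)]
    for p in points:
        for i in range(d):
            for j in range(d):
                gram[i][j] += p[i] * p[j]
    return gram


def sum_grams(grams):
    """Coordinator side: add the per-server summaries entrywise."""
    d = len(grams[0])
    total = [[0.0] * d for _ in range(d)]
    for g in grams:
        for i in range(d):
            for j in range(d):
                total[i][j] += g[i][j]
    return total


def top_direction(gram, iters=200):
    """Power iteration for the leading eigenvector of a symmetric PSD matrix."""
    d = len(gram)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(gram[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


# Toy data (assumed already centered): three servers, 2-dimensional points.
server_points = [
    [[2.0, 0.1], [-2.0, -0.1]],
    [[4.0, -0.2], [-4.0, 0.2]],
    [[1.0, 0.0], [-1.0, 0.0]],
]

# Each server sends one d x d summary; the coordinator never sees raw points.
global_gram = sum_grams([local_gram(pts) for pts in server_points])
direction = top_direction(global_gram)
print(direction)
```

Because the summed matrix equals the one computed from the union of all points, this baseline recovers the same leading direction as centralized PCA. Its d^2 cost per server is the kind of communication cost that approximate distributed PCA algorithms aim to reduce when d is large.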

artificial intelligence, machine learning, principal component analysis
