FameSVD: Fast and Memory-efficient Singular Value Decomposition
Li, Xiaocan, Wang, Shuo, Cai, Yinghao
Dimensionality reduction has long been a central topic in machine learning. Linear subspace methods, e.g., Principal Component Analysis (PCA) and its variants, have been widely studied [21, 16, 23], and several works introduce probability and randomness to realize PCA [18, 12, 11, 14, 20]. However, linear subspace methods are not applicable when the data lie on a nonlinear manifold [2, 15]. Due to its direct connection with PCA, Singular Value Decomposition (SVD) is one of the best-known algorithms for low-rank approximation [19, 9], and it is widely used throughout the machine learning and statistics communities. Applications of SVD include least-squares problems [10, 22], latent semantic analysis [7, 13], genetic analysis, matrix completion [17, 3, 5, 4], data mining [6, 1], etc. However, for a large-scale matrix, the runtime of traditional SVD is intolerable and its memory usage can be enormous.
Notation: we are given a matrix $A$ of size $m \times n$, usually with $m \gg n$. Our goal is to find the Principal Components (PCs) given a cumulative explained variance threshold $t$.
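To make the problem setup concrete, the following is a minimal sketch (not the paper's algorithm) of selecting PCs by a cumulative explained variance threshold $t$ using a plain thin SVD in NumPy; the function name select_components and the example data are illustrative assumptions.

```python
import numpy as np

def select_components(A, t):
    """Illustrative sketch: pick the smallest number of principal
    components whose cumulative explained variance reaches threshold t."""
    # Center the columns so singular values relate to variance (PCA convention).
    A_centered = A - A.mean(axis=0)
    # Thin SVD: for an m x n matrix with m >> n, only n singular values exist.
    U, s, Vt = np.linalg.svd(A_centered, full_matrices=False)
    # Explained variance of component i is proportional to sigma_i^2.
    explained = s**2 / np.sum(s**2)
    k = int(np.searchsorted(np.cumsum(explained), t) + 1)
    # Projected data (the PCs) and the corresponding loading directions.
    pcs = U[:, :k] * s[:k]          # equivalently A_centered @ Vt[:k].T
    return pcs, Vt[:k], k

# Example (hypothetical data): a tall 10000 x 50 matrix, keep 95% of the variance.
A = np.random.randn(10000, 50)
pcs, directions, k = select_components(A, t=0.95)
print(k, pcs.shape)
```

This baseline computes the full thin SVD before truncating; the point of a fast, memory-efficient SVD is to avoid exactly that cost when $m$ is very large.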
Jun-28-2019