krasulina
- Oceania > Australia > New South Wales > Sydney (0.04)
- North America > United States > Nevada (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- (6 more...)
Exponentially convergent stochastic k-PCA without variance reduction
We show, both theoretically and empirically, that the algorithm naturally adapts to data low-rankness and converges exponentially fast to the ground-truth principal subspace. Notably, our result suggests that despite various recent efforts to accelerate the convergence of stochastic-gradient-based methods by adding an O(n)-time variance-reduction step, for the k-PCA problem a truly online SGD variant suffices to achieve exponential convergence on intrinsically low-rank data.
- North America > United States > New York > New York County > New York City (0.14)
- Oceania > Australia > New South Wales > Sydney (0.04)
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
- (9 more...)
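The k-PCA abstract above contrasts variance-reduced methods with a truly online SGD variant but does not spell out the update itself. As a point of reference, here is a minimal Oja-style online k-PCA loop of the kind being discussed; this is a sketch under assumed choices (a 1/t step size, per-step QR re-orthonormalization), not the paper's algorithm, and `online_kpca` and its parameters are illustrative names.

```python
import numpy as np

def online_kpca(stream, d, k, eta0=1.0, seed=0):
    """Truly online k-PCA: one update per sample, no variance-reduction pass.

    Oja-style sketch only; the 1/t step size and per-step QR are
    placeholder choices, not the paper's exact algorithm.
    """
    rng = np.random.default_rng(seed)
    U, _ = np.linalg.qr(rng.standard_normal((d, k)))   # random orthonormal start
    for t, x in enumerate(stream, start=1):
        U += (eta0 / t) * np.outer(x, x @ U)           # stochastic gradient step
        U, _ = np.linalg.qr(U)                         # re-orthonormalize columns
    return U

# Toy usage: samples with intrinsic rank-2 structure embedded in R^20.
rng = np.random.default_rng(1)
B, _ = np.linalg.qr(rng.standard_normal((20, 2)))      # ground-truth subspace
X = rng.standard_normal((5000, 2)) @ B.T               # low-rank data stream
U = online_kpca(iter(X), d=20, k=2)
print(np.linalg.norm(U @ U.T - B @ B.T))               # small subspace error
```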
FAST-PCA: A Fast and Exact Algorithm for Distributed Principal Component Analysis
Gang, Arpita, Bajwa, Waheed U.
Principal Component Analysis (PCA) is a fundamental data preprocessing tool in machine learning. While PCA is often thought of as a dimensionality-reduction method, its purpose is actually two-fold: dimension reduction and uncorrelated feature learning. Furthermore, the enormous dimensions and sample sizes of modern datasets have rendered centralized PCA solutions unusable. In that vein, this paper reconsiders the problem of PCA when data samples are distributed across nodes in an arbitrarily connected network. While a few solutions for distributed PCA exist, they either overlook the uncorrelated-feature-learning aspect of PCA, have communication overhead high enough to make them inefficient, and/or lack 'exact' or 'global' convergence guarantees. To overcome these issues, this paper proposes a distributed PCA algorithm termed FAST-PCA (Fast and exAct diSTributed PCA). The proposed algorithm is efficient in terms of communication and is proven to converge linearly and exactly to the principal components, delivering dimension reduction as well as uncorrelated features. The claims are further supported by experimental results.
- North America > United States > New York (0.04)
- North America > United States > New Jersey > Middlesex County > New Brunswick (0.04)
- North America > United States > Maryland > Baltimore (0.04)
- Europe > United Kingdom > England > East Sussex > Brighton (0.04)
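The FAST-PCA abstract above does not give the update rule, so the following is only a toy illustration of the general setting it describes: each node holds a local covariance, mixes its estimate with its neighbors' through a doubly stochastic weight matrix `W`, and re-orthonormalizes, i.e. a distributed flavor of orthogonal iteration. This is explicitly not the FAST-PCA algorithm; `decentralized_subspace` and all parameters are hypothetical.

```python
import numpy as np

def decentralized_subspace(C_locals, W, k, iters=200, seed=0):
    """Toy decentralized subspace estimation over a connected network.

    NOT FAST-PCA: each node averages neighbor estimates via the doubly
    stochastic mixing matrix W, applies its local sample covariance, and
    re-orthonormalizes (a distributed orthogonal iteration sketch).
    """
    n, d = len(C_locals), C_locals[0].shape[0]
    rng = np.random.default_rng(seed)
    Q0, _ = np.linalg.qr(rng.standard_normal((d, k)))
    U = [Q0.copy() for _ in range(n)]                  # one estimate per node
    for _ in range(iters):
        mixed = [sum(W[i, j] * U[j] for j in range(n)) for i in range(n)]
        U = [np.linalg.qr(C_locals[i] @ M)[0] for i, M in enumerate(mixed)]
    return U
```

Note the communication pattern this sketch implies: per round, each node exchanges only a d-by-k matrix with its neighbors, never raw samples, which is the kind of communication efficiency the abstract emphasizes.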
EigenGame Unloaded: When playing games is better than optimizing
Gemp, Ian, McWilliams, Brian, Vernade, Claire, Graepel, Thore
We build on the recently proposed EigenGame, which views eigendecomposition as a competitive game. EigenGame's updates are biased if computed using minibatches of data, which hinders convergence and more sophisticated parallelism in the stochastic setting. In this work, we propose an unbiased stochastic update that is asymptotically equivalent to EigenGame, enjoys greater parallelism, allowing computation on datasets of larger sample size, and outperforms EigenGame in experiments. We present applications to finding the principal components of massive datasets and to performing spectral clustering of graphs. We analyze and discuss our proposed update in the context of EigenGame and the shift in perspective from optimization to games.
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
- Information Technology > Communications > Social Media (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)
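To make the bias issue in the EigenGame abstract concrete, here is a hedged sketch of an EigenGame-style per-minibatch step in which every term is linear in the minibatch covariance C, so the minibatch gives an unbiased estimate of the full-data update (the original EigenGame penalty contains a ratio of C-dependent terms, which is biased under minibatching). This shows the flavor of the fix, not the paper's exact update rule; `eigengame_style_step` is an illustrative name.

```python
import numpy as np

def eigengame_style_step(V, X_batch, eta=0.05):
    """One minibatch step for all k 'players' (the columns of V).

    Sketch only: each term is linear in the minibatch covariance C, so
    the minibatch update is an unbiased estimate of the full-data one.
    All players read the same frozen state, so the k updates could run
    in parallel.
    """
    C = X_batch.T @ X_batch / X_batch.shape[0]         # minibatch covariance
    V_old = V.copy()                                   # frozen state for all players
    CV = C @ V_old
    d, k = V.shape
    for i in range(k):
        grad = CV[:, i].copy()
        for j in range(i):                             # penalties from 'parent' players
            grad -= (V_old[:, j] @ CV[:, i]) * V_old[:, j]
        grad -= (V_old[:, i] @ grad) * V_old[:, i]     # project onto sphere tangent space
        V[:, i] = V_old[:, i] + eta * grad
        V[:, i] /= np.linalg.norm(V[:, i])             # each player stays on the unit sphere
    return V
```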
An Implicit Form of Krasulina's k-PCA Update without the Orthonormality Constraint
Amid, Ehsan, Warmuth, Manfred K.
We shed new light on the two commonly used updates for the online k-PCA problem, namely Krasulina's and Oja's updates. We show that Krasulina's update corresponds to a projected gradient descent step on the Stiefel manifold of orthonormal k-frames, while Oja's update amounts to a gradient descent step using the unprojected gradient. Following these observations, we derive a more implicit form of Krasulina's k-PCA update, i.e., a version that uses the information of the future gradient as much as possible. Most interestingly, our implicit Krasulina update avoids the costly QR-decomposition step by bypassing the orthonormality constraint. We show that the new update in fact corresponds to an online EM step applied to a probabilistic k-PCA model. The probabilistic view of the updates allows us to combine multiple models in a distributed setting. We show experimentally that the implicit Krasulina update yields superior convergence while being significantly faster. We also give strong evidence that the new update can benefit from parallelism and is more stable w.r.t. tuning of the learning rate.
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
- North America > United States > California > Santa Cruz County > Santa Cruz (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- (3 more...)
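The distinction this abstract draws between the two classical updates can be written down directly: Oja steps with the unprojected gradient, Krasulina projects the gradient onto the tangent space of the Stiefel manifold first. The sketch below shows the classical explicit forms only; the paper's new implicit update, which avoids the QR step entirely, is not reproduced here.

```python
import numpy as np

def oja_step(W, x, eta):
    """Oja: a step with the *unprojected* gradient, then re-orthonormalize."""
    W = W + eta * np.outer(x, x @ W)
    return np.linalg.qr(W)[0]

def krasulina_step(W, x, eta):
    """Krasulina: a *projected* gradient step on the Stiefel manifold.

    With W orthonormal, (I - W W^T) projects the gradient onto the
    tangent space before stepping; QR here is a numerical safeguard,
    which the paper's implicit variant is designed to avoid.
    """
    G = np.outer(x, x @ W)                             # unprojected gradient
    W = W + eta * (G - W @ (W.T @ G))                  # tangent-space projection
    return np.linalg.qr(W)[0]
```

For k = 1 and a unit-norm w, the projected step reduces to Krasulina's classical rule w + eta * (x x^T w - (w^T x)^2 w), which connects the manifold view back to the familiar scalar form.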