top eigenvector
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
4a1590df1d5968d41b855005bb8b67bf-Paper.pdf
For regression, we obtain a running time of O(nd+(nL/µ) p snL/µ) where µ > 0 is the smallest eigenvalue ofA>A. This running time improves upon the previous best unaccelerated running time of O(nd + nLd/µ). This result expands the regimes where regression can be solved in nearly linear time from whenL/µ= O(1)towhenL/µ= O(d2/3/(sn)1/3).
- North America > United States > California > Santa Clara County > Palo Alto (0.05)
- North America > Canada > Quebec > Montreal (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Europe > Italy > Lazio > Rome (0.04)
- Oceania > Australia > New South Wales > Sydney (0.04)
- (11 more...)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Europe > Italy > Lazio > Rome (0.04)
- Oceania > Australia > New South Wales > Sydney (0.04)
- (11 more...)
Approximating the Top Eigenvector in Random Order Streams
When rows of an $n \times d$ matrix $A$ are given in a stream, we study algorithms for approximating the top eigenvector of $A^T A$ (equivalently, the top right singular vector of $A$). We consider worst case inputs $A$ but assume that the rows are presented to the streaming algorithm in a uniformly random order. We show that when the gap parameter $R = \sigma_1(A)^2/\sigma_2(A)^2 = \Omega(1)$, then there is a randomized algorithm that uses $O(h \cdot d \cdot \text{polylog}(d))$ bits of space and outputs a unit vector $v$ that has a correlation $1 - O(1/\sqrt{R})$ with the top eigenvector $v_1$. Here $h$ denotes the number of ``heavy rows'' in the matrix, defined as the rows with Euclidean norm at least $\|{A}\|_F/\sqrt{d \cdot \text{polylog}(d)}$. We also provide a lower bound showing that any algorithm using $O(hd/R)$ bits of space can obtain at most $1 - \Omega(1/R^2)$ correlation with the top eigenvector. Thus, parameterizing the space complexity in terms of the number of heavy rows is necessary for high accuracy solutions.Our results improve upon the $R = \Omega(\log n \cdot \log d)$ requirement in a recent work of Price. We note that Price's algorithm works for arbitrary order streams whereas our algorithm requires a stronger assumption that the rows are presented in a uniformly random order. We additionally show that the gap requirements in Price's analysis can be brought down to $R = \Omega(\log^2 d)$ for arbitrary order streams and $R = \Omega(\log d)$ for random order streams. The requirement of $R = \Omega(\log d)$ for random order streams is nearly tight for Price's analysis as we obtain a simple instance with $R = \Omega(\log d/\log\log d)$ for which Price's algorithm, with any fixed learning rate, cannot output a vector approximating the top eigenvector $v_1$.
- North America > United States > California > Santa Clara County > Palo Alto (0.05)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
PCA recovery thresholds in low-rank matrix inference with sparse noise
Adomaityte, Urte, Sicuro, Gabriele, Vivo, Pierpaolo
We study the high-dimensional inference of a rank-one signal corrupted by sparse noise. The noise is modelled as the adjacency matrix of a weighted undirected graph with finite average connectivity in the large size limit. Using the replica method from statistical physics, we analytically compute the typical value of the top eigenvalue, the top eigenvector component density, and the overlap between the signal vector and the top eigenvector. The solution is given in terms of recursive distributional equations for auxiliary probability density functions which can be efficiently solved using a population dynamics algorithm. Specialising the noise matrix to Poissonian and Random Regular degree distributions, the critical signal strength is analytically identified at which a transition happens for the recovery of the signal via the top eigenvector, thus generalising the celebrated BBP transition to the sparse noise case. In the large-connectivity limit, known results for dense noise are recovered. Analytical results are in agreement with numerical diagonalisation of large matrices.
- North America > United States > New York (0.04)
- Europe > Italy > Emilia-Romagna > Metropolitan City of Bologna > Bologna (0.04)
- Oceania > New Zealand (0.04)
- (4 more...)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.93)