
Neural Information Processing Systems

For regression, we obtain a running time of O(nd + (nL/µ)√(snL/µ)), where µ > 0 is the smallest eigenvalue of AᵀA. This running time improves upon the previous best unaccelerated running time of O(nd + nLd/µ). This result expands the regimes where regression can be solved in nearly linear time from L/µ = O(1) to L/µ = O(d^{2/3}/(sn)^{1/3}).
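A minimal sketch of the quantities in this bound: µ is the smallest eigenvalue of AᵀA, and here we take L to be its largest eigenvalue, a common smoothness parameter for least-squares objectives (the paper's exact definition of L and of the sparsity parameter s may differ).

```python
import numpy as np

# Hedged sketch: computing mu (smallest eigenvalue of A^T A) and a candidate
# smoothness parameter L (largest eigenvalue) for least-squares regression
# min_x ||Ax - b||^2.  The paper's precise definition of L may differ.
rng = np.random.default_rng(0)
n, d = 200, 5
A = rng.standard_normal((n, d))

eigvals = np.linalg.eigvalsh(A.T @ A)  # eigenvalues of A^T A, ascending
mu, L = eigvals[0], eigvals[-1]
condition = L / mu  # the ratio L/mu that governs the running-time bound
```

When A has full column rank, µ > 0 and the objective is strongly convex, so the ratio L/µ is a well-defined condition number at least 1.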



Total Least Squares Regression in Input Sparsity Time

Huaian Diao, Zhao Song, David Woodruff, Xin Yang

Neural Information Processing Systems

In the total least squares problem, one is given an m × n matrix A and an m × d matrix B, and one seeks to "correct" both A and B, obtaining matrices Â and B̂, so that there exists an X satisfying the equation ÂX = B̂. Typically the problem is overconstrained, meaning that m ≫ max(n, d).
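For concreteness, here is the classical SVD-based solution to total least squares (the Golub–Van Loan construction), not the input-sparsity-time algorithm of the paper: the optimal correction comes from the trailing singular subspace of the stacked matrix [A B].

```python
import numpy as np

# Hedged sketch: classical SVD-based total least squares, illustrative only.
# Given A (m x n) and B (m x d), find X minimizing the Frobenius norm of the
# correction to [A B] such that (A + dA) @ X = (B + dB) exactly.
rng = np.random.default_rng(1)
m, n, d = 50, 3, 1
A = rng.standard_normal((m, n))
X_true = rng.standard_normal((n, d))
B = A @ X_true + 0.01 * rng.standard_normal((m, d))  # small perturbation

# SVD of the stacked matrix; partition the right singular vectors.
_, _, Vt = np.linalg.svd(np.hstack([A, B]), full_matrices=False)
V = Vt.T
V12 = V[:n, n:]  # top-right block (n x d)
V22 = V[n:, n:]  # bottom-right block (d x d), invertible in the generic case
X_tls = -V12 @ np.linalg.inv(V22)
```

With a small perturbation of B, the recovered X_tls is close to the X_true that generated the data.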





Variance Reduction for Matrix Games

Neural Information Processing Systems

We present a randomized primal-dual algorithm that solves the problem min_x max_y yᵀAx to additive error ε in time nnz(A) + √(nnz(A)·n)/ε, for matrix A with larger dimension n and nnz(A) nonzero entries. This improves the best known exact gradient methods by a factor of √(nnz(A)/n) and is faster than fully stochastic gradient methods in the accurate and/or sparse regime ε < √(n/nnz(A)). Our results hold for x, y in the simplex (matrix games, linear programming) and for x in an ℓ₂ ball and y in the simplex (perceptron / SVM, minimum enclosing ball). Our algorithm combines Nemirovski's conceptual prox-method and a novel reduced-variance gradient estimator based on sampling from the difference between the current iterate and a reference point.
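The flavor of such a reduced-variance estimator can be sketched in a few lines (a simplification of the paper's construction): with Ax₀ cached at a reference point x₀, one estimates Ax by sampling a single column of A with probability proportional to |x − x₀|, which is cheap when the iterate is close to the reference.

```python
import numpy as np

# Hedged sketch: a reduced-variance estimator for the bilinear gradient A @ x,
# sampling from the difference between the iterate x and a reference x0.
# The sampling scheme here is simplified relative to the paper's.
rng = np.random.default_rng(2)
n = 100
A = rng.standard_normal((n, n))
x0 = np.ones(n) / n             # reference point on the simplex
x = rng.dirichlet(np.ones(n))   # current iterate on the simplex
Ax0 = A @ x0                    # cached once per reference point

diff = x - x0
p = np.abs(diff) / np.abs(diff).sum()  # sample coordinates prop. to |x - x0|

def estimate_Ax():
    i = rng.choice(n, p=p)
    # Unbiased: E[A[:, i] * diff[i] / p[i]] = A @ diff, so E[estimate] = A @ x.
    return Ax0 + A[:, i] * (diff[i] / p[i])

avg = np.mean([estimate_Ax() for _ in range(20000)], axis=0)
```

Each estimate touches only one column of A, yet averaging many of them recovers A @ x; the closer x is to x₀, the smaller the estimator's variance.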


Is Input Sparsity Time Possible for Kernel Low-Rank Approximation?

Neural Information Processing Systems

Low-rank approximation is a common tool used to accelerate kernel methods: the $n \times n$ kernel matrix $K$ is approximated via a rank-$k$ matrix $\tilde K$ which can be stored in much less space and processed more quickly. In this work we study the limits of computationally efficient low-rank kernel approximation. We show that for a broad class of kernels, including the popular Gaussian and polynomial kernels, computing a relative-error rank-$k$ approximation to $K$ is at least as difficult as multiplying the input data matrix $A \in R^{n \times d}$ by an arbitrary matrix $C \in R^{d \times k}$. Barring a breakthrough in fast matrix multiplication, when $k$ is not too large, this requires $\Omega(nnz(A)k)$ time where $nnz(A)$ is the number of non-zeros in $A$. This lower bound matches, in many parameter regimes, recent work on subquadratic time algorithms for low-rank approximation of general kernels [MM16,MW17], demonstrating that these algorithms are unlikely to be significantly improved, in particular to $O(nnz(A))$ input sparsity runtimes. At the same time there is hope: we show for the first time that $O(nnz(A))$ time approximation is possible for general radial basis function kernels (e.g., the Gaussian kernel) for the closely related problem of low-rank approximation of the kernelized dataset.
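One standard way to build a rank-$k$ surrogate for a Gaussian kernel matrix is random Fourier features (Rahimi and Recht); this is illustrative of the general setting, not the specific input-sparsity-time algorithm the abstract refers to. Note the dominant cost is the product $AW$, which takes $O(nnz(A) \cdot k)$ time.

```python
import numpy as np

# Hedged sketch: random Fourier features for the Gaussian kernel
# K_ij = exp(-gamma * ||a_i - a_j||^2), giving a rank-k factorization
# Z @ Z.T ~ K.  Illustrative only; not the paper's algorithm.
rng = np.random.default_rng(3)
n, d, k = 300, 10, 200
A = rng.standard_normal((n, d))
gamma = 0.02

# Sample frequencies w ~ N(0, 2*gamma*I) and phases b ~ Uniform[0, 2*pi),
# so that E[z(a_i)^T z(a_j)] = K_ij.
W = rng.standard_normal((d, k)) * np.sqrt(2 * gamma)
b = rng.uniform(0, 2 * np.pi, k)
Z = np.sqrt(2.0 / k) * np.cos(A @ W + b)  # n x k feature map

# Exact kernel matrix for comparison (O(n^2 d); avoided in practice).
sq = np.sum(A**2, axis=1)
K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * A @ A.T))
err = np.linalg.norm(Z @ Z.T - K) / np.linalg.norm(K)
```

The relative Frobenius error decays roughly as $1/\sqrt{k}$, so modest numbers of features already give a usable approximation of the kernelized dataset.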