
Collaborating Authors

 David Woodruff


Average Case Column Subset Selection for Entrywise $\ell_1$-Norm Loss

Neural Information Processing Systems

Nevertheless, we show that under certain minimal and realistic distributional settings, it is possible to obtain a (1+ɛ)-approximation with a nearly linear running time and poly(k/ɛ) + O(k log n) columns. Namely, we show that if the input matrix A has the form A = B + E, where B is an arbitrary rank-k matrix, and E is a matrix with i.i.d.
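As a toy illustration of this distributional model (not the paper's algorithm), the sketch below generates A = B + E with a random rank-k factor B and i.i.d. Gaussian noise E, selects a uniformly random subset of O(k log n) columns, and reports the entrywise $\ell_1$ error of fitting A from those columns; a least-squares fit is used only as a simple stand-in for the $\ell_1$ regression the guarantee actually refers to, and all sizes are arbitrary choices.

```python
import numpy as np

# Illustrative sketch of the A = B + E model (not the paper's algorithm).
# B is a random rank-k matrix; E has i.i.d. Gaussian entries (one possible
# noise distribution, chosen here purely for illustration).
rng = np.random.default_rng(0)
n, d, k = 500, 200, 5
B = rng.standard_normal((n, k)) @ rng.standard_normal((k, d))
E = 0.1 * rng.standard_normal((n, d))
A = B + E

# Pick a uniformly random subset of O(k log d) columns.
s = int(k * np.log(d)) + 1
S = rng.choice(d, size=s, replace=False)

# Fit all of A from the selected columns. A least-squares fit is a cheap proxy;
# the (1+eps) guarantee in the paper concerns an l1 regression fit.
X, *_ = np.linalg.lstsq(A[:, S], A, rcond=None)
ell1_error = np.abs(A - A[:, S] @ X).sum()
print("entrywise l1 error of column-subset fit:", ell1_error)
```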


Regularized Weighted Low Rank Approximation

Neural Information Processing Systems

The classical low rank approximation problem is to find a rank-k matrix UV (where U has k columns and V has k rows) that minimizes the Frobenius norm of A − UV. Although this problem can be solved efficiently, we study an NP-hard variant of this problem that involves weights and regularization. A previous paper of [Razenshteyn et al. '16] derived a polynomial time algorithm for weighted low rank approximation with constant rank. We derive provably sharper guarantees for the regularized version by obtaining parameterized complexity bounds in terms of the statistical dimension rather than the rank, allowing for a rank-independent runtime that can be significantly faster. Our improvement comes from applying sharper matrix concentration bounds, using a novel conditioning technique, and proving structural theorems for regularized low rank problems.
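For concreteness, below is a minimal alternating-minimization sketch of one common form of the objective, $\min_{U,V} \|W \circ (A - UV)\|_F^2 + \lambda(\|U\|_F^2 + \|V\|_F^2)$; this is the standard heuristic for weighted, regularized low rank approximation, not the parameterized algorithm analyzed in the paper, and the names W and λ follow common convention rather than the paper's notation.

```python
import numpy as np

def weighted_regularized_als(A, W, k, lam=0.1, iters=50, seed=0):
    """Alternating minimization for
        min_{U,V} ||W * (A - U V)||_F^2 + lam * (||U||_F^2 + ||V||_F^2),
    where * is the entrywise product. Standard heuristic, not the paper's algorithm."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    U = rng.standard_normal((m, k))
    V = rng.standard_normal((k, n))
    I = lam * np.eye(k)
    for _ in range(iters):
        # Each row of U has a closed-form weighted ridge-regression update.
        for i in range(m):
            G = V * W[i]                      # k x n, columns scaled by weights W[i, :]
            U[i] = np.linalg.solve(G @ V.T + I, G @ A[i])
        # Symmetric update for each column of V.
        for j in range(n):
            H = U.T * W[:, j]                 # k x m, columns scaled by weights W[:, j]
            V[:, j] = np.linalg.solve(H @ U + I, H @ A[:, j])
    return U, V
```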


Total Least Squares Regression in Input Sparsity Time

Neural Information Processing Systems

In the total least squares problem, one is given an m × n matrix A and an m × d matrix B, and one seeks to "correct" both A and B, obtaining matrices Â and B̂, so that there exists an X satisfying the equation ÂX = B̂. Typically the problem is overconstrained, meaning that m ≫ max(n, d).
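For reference, the classical total least squares solution in the overconstrained case is read off from an SVD of the stacked matrix [A B]; the textbook construction is sketched below and says nothing about the input-sparsity-time algorithm of the paper.

```python
import numpy as np

def total_least_squares(A, B):
    """Classical TLS via the SVD of the stacked matrix [A B] (textbook construction,
    not the paper's input-sparsity-time algorithm). Assumes m >= n + d.
    Returns X such that (A + dA) X = B + dB for the minimal-norm correction."""
    n = A.shape[1]
    d = B.shape[1]
    # Right singular vectors of [A B]; the last d of them encode the correction.
    _, _, Vt = np.linalg.svd(np.hstack([A, B]), full_matrices=False)
    V = Vt.T
    V12 = V[:n, n:]      # top-right block (n x d)
    V22 = V[n:, n:]      # bottom-right block (d x d), assumed invertible
    return -V12 @ np.linalg.inv(V22)
```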



Communication-Optimal Distributed Clustering

Neural Information Processing Systems

Clustering large datasets is a fundamental problem with a number of applications in machine learning. Data is often collected on different sites and clustering needs to be performed in a distributed manner with low communication. We would like the quality of the clustering in the distributed setting to match that in the centralized setting for which all the data resides on a single site. In this work, we study both graph and geometric clustering problems in two distributed models: (1) a point-to-point model, and (2) a model with a broadcast channel. We give protocols in both models which we show are nearly optimal by proving almost matching communication lower bounds. Our work highlights the surprising power of a broadcast channel for clustering problems; roughly speaking, to spectrally cluster n points or n vertices in a graph distributed across s servers, for a worst-case partitioning the communication complexity in a point-to-point model is n · s, while in the broadcast model it is n + s. A similar phenomenon holds for the geometric setting as well. We implement our algorithms and demonstrate this phenomenon on real-life datasets, showing that our algorithms are also very efficient in practice.


Sublinear Time Orthogonal Tensor Decomposition

Neural Information Processing Systems

Prior sketching-based algorithms for orthogonal tensor decomposition compute sketches of the input tensor, which requires reading the entire input. We show in a number of cases one can achieve the same theoretical guarantees in sublinear time, i.e., even without reading most of the input tensor. Instead of using sketches to estimate inner products in tensor decomposition algorithms, we use importance sampling. To achieve sublinear time, we need to know the norms of tensor slices, and we show how to do this in a number of important cases.
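As a toy illustration of the importance-sampling primitive, with slice norms defining the sampling distribution, the sketch below gives an unbiased estimate of the contraction T(u, v, w) = Σ_{ijk} T_{ijk} u_i v_j w_k by sampling slices proportionally to their squared Frobenius norms; it shows only the estimator idea, not the full decomposition algorithm from the paper.

```python
import numpy as np

def sampled_contraction(T, u, v, w, num_samples=200, seed=0):
    """Unbiased importance-sampling estimate of T(u, v, w) = sum_ijk T_ijk u_i v_j w_k.
    Slices T[i] are sampled with probability proportional to their squared Frobenius
    norms (a toy illustration of the estimator, not the paper's full algorithm)."""
    rng = np.random.default_rng(seed)
    slice_norms = np.array([np.linalg.norm(T[i]) ** 2 for i in range(T.shape[0])])
    p = slice_norms / slice_norms.sum()
    idx = rng.choice(T.shape[0], size=num_samples, p=p)
    # Each sample contributes u_i * (v^T T_i w) / p_i; averaging keeps the estimate unbiased.
    estimates = [u[i] * (v @ T[i] @ w) / p[i] for i in idx]
    return np.mean(estimates)

# Quick check against the exact contraction on a small random tensor.
rng = np.random.default_rng(1)
T = rng.standard_normal((30, 30, 30))
u, v, w = rng.standard_normal(30), rng.standard_normal(30), rng.standard_normal(30)
exact = np.einsum('ijk,i,j,k->', T, u, v, w)
print(exact, sampled_contraction(T, u, v, w))
```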


Approximation Algorithms for $\ell_0$-Low Rank Approximation

Neural Information Processing Systems

This NP-hard variant of low rank approximation is natural for problems with no underlying metric, and its goal is to minimize the number of disagreeing data positions. We provide approximation algorithms which significantly improve the running time and approximation factor of previous work.
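To make the objective concrete, the snippet below simply counts the positions where a matrix and a candidate rank-k factorization disagree; it evaluates the $\ell_0$ loss and is not one of the paper's approximation algorithms.

```python
import numpy as np

def l0_loss(A, U, V):
    """Entrywise l0 loss: the number of positions where A and U @ V disagree."""
    return int(np.count_nonzero(A != (U @ V)))

# Tiny Boolean example: a rank-1 candidate u v^T against a noisy block matrix.
A = np.array([[1, 1, 0],
              [1, 1, 0],
              [0, 1, 0]])
u = np.array([[1], [1], [0]])
v = np.array([[1, 1, 0]])
print(l0_loss(A, u, v))   # 1 disagreeing position
```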



Is Input Sparsity Time Possible for Kernel Low-Rank Approximation?

Neural Information Processing Systems

Low-rank approximation is a common tool used to accelerate kernel methods: the n × n kernel matrix K is approximated via a rank-k matrix K̃ which can be stored in much less space and processed more quickly. In this work we study the limits of computationally efficient low-rank kernel approximation.
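One standard way to form such a rank-k approximation is the Nyström method with sampled columns, sketched below as a common baseline; it is not the algorithm (or lower-bound construction) studied in the paper, and the landmark count is an arbitrary choice.

```python
import numpy as np

def nystrom_approximation(K, k, num_landmarks=50, seed=0):
    """Rank-k Nystrom approximation of an n x n PSD kernel matrix K using
    uniformly sampled landmark columns (standard baseline, not the paper's method)."""
    rng = np.random.default_rng(seed)
    n = K.shape[0]
    m = min(num_landmarks, n)
    idx = rng.choice(n, size=m, replace=False)
    C = K[:, idx]                       # n x m sampled columns
    W = K[np.ix_(idx, idx)]             # m x m intersection block
    # Rank-k pseudo-inverse of W via its eigendecomposition.
    vals, vecs = np.linalg.eigh(W)
    top = np.argsort(vals)[::-1][:k]
    W_pinv_k = vecs[:, top] @ np.diag(1.0 / np.maximum(vals[top], 1e-12)) @ vecs[:, top].T
    return C @ W_pinv_k @ C.T           # n x n approximation of K with rank at most k
```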


On Coresets for Logistic Regression

Neural Information Processing Systems

Coresets are one of the central methods to facilitate the analysis of large data. We continue a recent line of research applying the theory of coresets to logistic regression. First, we show the negative result that no strongly sublinear-sized coresets exist for logistic regression. To deal with intractable worst-case instances, we introduce a complexity measure µ(X), which quantifies the hardness of compressing a data set for logistic regression.
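To make the compression idea concrete, the sketch below draws a weighted subsample by importance sampling, using squared row norms as a crude stand-in for sensitivity scores, and fits a weighted logistic regression on it; the paper's construction is tied to the measure µ(X), which this toy proxy does not implement.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def sampled_logistic_fit(X, y, size, seed=0):
    """Fit logistic regression on a weighted subsample drawn by importance sampling.
    The scores below are a crude proxy (squared row norms plus a uniform term),
    NOT the sensitivity framework based on mu(X) from the paper.
    Assumes both classes appear in the subsample."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    scores = np.linalg.norm(X, axis=1) ** 2 + 1.0 / n
    p = scores / scores.sum()
    idx = rng.choice(n, size=size, replace=True, p=p)
    weights = 1.0 / (size * p[idx])          # reweight so the subsample estimates the full loss
    model = LogisticRegression(max_iter=1000)
    model.fit(X[idx], y[idx], sample_weight=weights)
    return model
```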