 heavy hitter





Linear and Kernel Classification in the Streaming Model: Improved Bounds for Heavy Hitters

Neural Information Processing Systems

We consider logistic regression, and more generally, linear classification, in the streaming model. In our setting, we are given a dataset consisting of $T$ examples $(x_t, y_t)$, where $t \in [T]$, $x_t \in \mathbb{R}^d$, and $y_t \in \{-1, 1\}$. The examples arrive one by one, and moreover, the nonzero coordinates of each example $x_t$ arrive one by one.
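The input model described above can be made concrete with a minimal sketch. The names `Example` and `coordinate_stream` are hypothetical and chosen only for illustration; the sketch shows the order in which an algorithm would see the data, not any algorithm from the paper.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical representation: (sparse features x_t as {coordinate: value}, label y_t).
Example = Tuple[Dict[int, float], int]


def coordinate_stream(examples: List[Example]) -> Iterator[Tuple[int, int, float, int]]:
    """Yield (t, i, x_t[i], y_t), one nonzero coordinate at a time."""
    for t, (x_t, y_t) in enumerate(examples, start=1):
        if y_t not in (-1, 1):
            raise ValueError("labels must be -1 or +1")
        for i, value in x_t.items():
            if value != 0.0:  # only nonzero coordinates appear in the stream
                yield (t, i, value, y_t)


# Two examples in dimension d >= 8; the second has a single nonzero coordinate.
examples = [({0: 1.5, 7: -2.0}, 1), ({3: 0.5}, -1)]
updates = list(coordinate_stream(examples))
```

A streaming algorithm would consume `updates` in order without storing the full examples; materializing the list here is only for inspection.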


Linear and Kernel Classification in the Streaming Model: Improved Bounds for Heavy Hitters

Neural Information Processing Systems

We study linear and kernel classification in the streaming model. For linear classification, we improve upon the algorithm of (Tai et al., 2018), which solves the $\ell_1$ point query problem on the optimal weight vector $w_* \in \mathbb{R}^d$ in sublinear space. We first give an algorithm solving the more difficult $\ell_2$ point query problem on $w_*$, also in sublinear space. We also give an algorithm which solves the $\ell_2$ heavy hitter problem on $w_*$, in sublinear space and running time. Finally, we give an algorithm which can $\textit{deterministically}$ solve the $\ell_1$ point query problem on $w_*$, with sublinear space improving upon that of (Tai et al., 2018). For kernel classification, if $w_* \in \mathbb{R}^{d^p}$ is the optimal weight vector classifying points in the stream according to their $p^{th}$-degree polynomial kernel, then we give an algorithm solving the $\ell_2$ point query problem on $w_*$ in $\text{poly}(\frac{p \log d}{\varepsilon})$ space, and an algorithm solving the $\ell_2$ heavy hitter problem in $\text{poly}(\frac{p \log d}{\varepsilon})$ space and running time. Note that our space and running time are polynomial in $p$, making our algorithms well-suited to high-degree polynomial kernels and the Gaussian kernel (approximated by the polynomial kernel of degree $p = \Theta(\log T)$).
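For orientation, an $\ell_2$ point query in its usual formulation asks for an estimate $\hat{w}_i$ of a coordinate with $|\hat{w}_i - w_i| \le \varepsilon \|w\|_2$. The sketch below illustrates that guarantee with the classical Count-Sketch data structure, applied to a vector that is streamed explicitly as coordinate updates. It is not the algorithm from the paper, where $w_*$ is defined only implicitly as the optimum of a classification objective. The class name `CountSketch`, its parameters, and the use of Python's built-in `hash` in place of properly limited-independence hash functions are all simplifying assumptions.

```python
import random
import statistics


class CountSketch:
    """Linear sketch supporting approximate point queries on a streamed vector."""

    def __init__(self, width: int, depth: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.width = width
        self.depth = depth
        self.table = [[0.0] * width for _ in range(depth)]
        # One pair of random salts per row: one for the bucket hash, one for the sign hash.
        # Assumption: salted built-in hash() stands in for pairwise-independent hashing.
        self.salts = [(rng.getrandbits(64), rng.getrandbits(64)) for _ in range(depth)]

    def _bucket_and_sign(self, row: int, index: int):
        bucket_salt, sign_salt = self.salts[row]
        bucket = hash((bucket_salt, index)) % self.width
        sign = 1 if hash((sign_salt, index)) & 1 else -1
        return bucket, sign

    def update(self, index: int, delta: float) -> None:
        """Apply w[index] += delta to the sketched vector."""
        for row in range(self.depth):
            bucket, sign = self._bucket_and_sign(row, index)
            self.table[row][bucket] += sign * delta

    def query(self, index: int) -> float:
        """Return the median of the per-row estimates of w[index]."""
        estimates = []
        for row in range(self.depth):
            bucket, sign = self._bucket_and_sign(row, index)
            estimates.append(sign * self.table[row][bucket])
        return statistics.median(estimates)


sketch = CountSketch(width=64, depth=5)
sketch.update(42, 100.0)        # one heavy coordinate
for j in range(1000, 1050):     # fifty light coordinates
    sketch.update(j, 0.01)
estimate = sketch.query(42)     # close to 100.0; error comes only from colliding light coordinates
```

The sketch uses `width * depth` counters regardless of the dimension of the vector, which is the sense in which such structures run in sublinear space. In the standard analysis, a width on the order of $1/\varepsilon^2$ and a depth logarithmic in the failure probability suffice for the $\ell_2$ guarantee.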


Practical Locally Private Heavy Hitters

Raef Bassily, Kobbi Nissim, Uri Stemmer, Abhradeep Guha Thakurta

Neural Information Processing Systems

With a typically large number of participants in local algorithms ($n$ in the millions), this reduction in time complexity, in particular at the user side, is crucial for the use of such algorithms in practice. We implemented Algorithm TreeHist to verify our theoretical analysis and compared its performance with the performance of Google's RAPPOR code.