Collaborating Authors

 Lei, Zhixian


A Fast Spectral Algorithm for Mean Estimation with Sub-Gaussian Rates

arXiv.org Machine Learning

We study the algorithmic problem of estimating the mean of a heavy-tailed random vector in $\mathbb{R}^d$, given $n$ i.i.d. samples. The goal is to design an efficient estimator that attains the optimal sub-Gaussian error bound, assuming only that the random vector has bounded mean and covariance. Polynomial-time solutions to this problem are known but have high runtime due to their use of semi-definite programming (SDP). Conceptually, it remains open whether convex relaxation is truly necessary for this problem. In this work, we show that it is possible to go beyond SDP and achieve better computational efficiency. In particular, we provide a spectral algorithm that achieves the optimal statistical performance and runs in time $\widetilde O\left(n^2 d \right)$, improving upon the previous fastest runtime $\widetilde O\left(n^{3.5}+ n^2d\right)$ by Cherapanamjeri et al. (COLT '19) and matching the concurrent work by Depersin and Lecué. Our algorithm is spectral in that it only requires (approximate) eigenvector computations, which can be implemented very efficiently by, for example, power iteration or the Lanczos method. At the core of our algorithm is a novel connection between the furthest hyperplane problem introduced by Karnin et al. (COLT '12) and a structural lemma on heavy-tailed distributions by Lugosi and Mendelson (Ann. Stat. '19). This allows us to iteratively reduce the estimation error at a geometric rate using only the information derived from the top singular vector of the data matrix, leading to a significantly faster running time.
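The eigenvector primitive the abstract refers to can be implemented without any convex solver. As a minimal sketch (not the paper's full estimator), power iteration on $A^\top A$ recovers the top right singular vector of a data matrix $A$ at a cost of $O(nd)$ per step:

```python
import numpy as np

def top_singular_vector(A, iters=100, seed=0):
    """Approximate the top right singular vector of A by power
    iteration on A^T A. Each step costs O(n d); no SDP is needed."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        w = A.T @ (A @ v)          # multiply by A^T A without forming it
        v = w / np.linalg.norm(w)  # renormalize to keep the iterate stable
    return v
```

In the algorithm described above, such a vector computed from (centered) sample data identifies the direction along which the current mean estimate is worst, which is then used to shrink the error geometrically.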


Efficient Online Model Adaptation by Incremental Simplex Tableau

AAAI Conferences

Online multi-kernel learning is promising in the era of mobile computing, in which a combined classifier with multiple kernels is trained offline and then adapts online to personalized features, serving the end user precisely and smartly. The online adaptation is mainly carried out on end-devices, which requires the adaptation algorithms to be light, efficient, and accurate. Previous results focused mainly on efficiency. This paper proposes a novel online model adaptation framework that targets not only efficiency but also optimal online adaptation. First, an online optimal incremental simplex tableau (IST) algorithm is proposed, which approaches model adaptation by linear programming and produces the optimized model update at each step as a personalized training sample is collected. But keeping each step online-optimal is expensive and may cause over-fitting, especially when the online data is noisy. A Fast-IST approach is therefore proposed, which measures the deviation between the training data and the current model. It schedules an update only when enough deviation is detected. The efficiency of each update is further enhanced by running IST for only a limited number of iterations, which bounds the computational complexity. Theoretical analysis and extensive evaluations show that Fast-IST saves computation cost greatly while achieving speedy and accurate model adaptation. It provides better model adaptation speed and accuracy at even lower computing cost than the state of the art.
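The scheduling idea in Fast-IST — skip cheap steps when the model already fits the new sample, and cap the work of each triggered update — can be sketched as follows. This is an illustrative stand-in, not the paper's actual IST linear-programming update: the hinge-style deviation measure, the `threshold`, and the perceptron-like correction step are all assumptions chosen for brevity.

```python
import numpy as np

def deviation_triggered_update(w, x, y, threshold=0.5, max_iters=5, lr=0.1):
    """Update a linear model w on sample (x, y) only when its deviation
    from the current model exceeds a threshold; cap the number of
    correction iterations to bound per-update computation."""
    deviation = max(0.0, 1.0 - y * (w @ x))  # hinge-style deviation measure
    if deviation < threshold:
        return w, False                      # deviation too small: skip update
    for _ in range(max_iters):               # bounded work per scheduled update
        if y * (w @ x) >= 1.0:
            break                            # sample now fits; stop early
        w = w + lr * y * x                   # simple correction step
    return w, True
```

The two levers shown here mirror the abstract's design: the threshold controls how often updates are scheduled (saving computation and resisting noisy samples), while `max_iters` bounds the cost of each update that does run.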