Multi-Stage Dantzig Selector

Liu, Ji, Wonka, Peter, Ye, Jieping

Neural Information Processing Systems 

We consider the following sparse signal recovery (or feature selection) problem: given a design matrix $X\in \mathbb{R} {n\times m}$ $(m\gg n)$ and a noisy observation vector $y\in \mathbb{R} {n}$ satisfying $y X\beta * \epsilon$ where $\epsilon$ is the noise vector following a Gaussian distribution $N(0,\sigma 2I)$, how to recover the signal (or parameter vector) $\beta *$ when the signal is sparse? The Dantzig selector has been proposed for sparse signal recovery with strong theoretical guarantees. In this paper, we propose a multi-stage Dantzig selector method, which iteratively refines the target signal $\beta *$. We show that if $X$ obeys a certain condition, then with a large probability the difference between the solution $\hat\beta$ estimated by the proposed method and the true solution $\beta *$ measured in terms of the $l_p$ norm ($p\geq 1$) is bounded as \begin{equation*} \ \hat\beta-\beta *\ _p\leq \left(C(s-N) {1/p}\sqrt{\log m} \Delta\right)\sigma, \end{equation*} $C$ is a constant, $s$ is the number of nonzero entries in $\beta *$, $\Delta$ is independent of $m$ and is much smaller than the first term, and $N$ is the number of entries of $\beta *$ larger than a certain value in the order of $\mathcal{O}(\sigma\sqrt{\log m})$. The proposed method improves the estimation bound of the standard Dantzig selector approximately from $Cs {1/p}\sqrt{\log m}\sigma$ to $C(s-N) {1/p}\sqrt{\log m}\sigma$ where the value $N$ depends on the number of large entries in $\beta *$.