Reviews: Sample Complexity of Learning Mixture of Sparse Linear Regressions

Neural Information Processing Systems 

The dependence on SNR is extreme. I wonder whether it is only an artifact of the proof or a fundamental limitation of the approach. The authors did not provide an empirical comparison to any competing method, not even to [27], on which the presented algorithm improves. It would be interesting to see how the algorithm compares empirically with the state of the art, particularly in the presence of noise. Does the proof not provide any dependence on L? 3. Some key definitions are missing.