Reviews: Global Optimality of Local Search for Low Rank Matrix Recovery

Neural Information Processing Systems 

This is a nice result. I will list a few nits I had as I read along; I think addressing some of these points would improve the presentation of the paper.

1. There are a few cases which are not covered by the results: for instance, the strict-saddle property in the noisy case, and local minima being close to global minima in the high-rank, noisy case. A discussion of why these cases are not covered would be nice; I am assuming it is not just a straightforward modification of the current proof?

2. In practice, I believe that randomly initialized gradient descent, without added noise, is sufficient.
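To illustrate point 2, here is a minimal sketch (my own illustration, not code from the paper) of plain gradient descent with a small random initialization on the factored objective f(U) = ||UU^T - M||_F^2 for a rank-r target M; empirically it reaches a global minimizer without any noise injection:

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 20, 3
U_star = rng.standard_normal((n, r))
M = U_star @ U_star.T                  # rank-r ground truth

U = 0.01 * rng.standard_normal((n, r))  # small random initialization
eta = 0.1 / np.linalg.norm(M, 2)        # step size scaled by spectral norm
for _ in range(10000):
    grad = 4 * (U @ U.T - M) @ U        # gradient of ||UU^T - M||_F^2
    U -= eta * grad

err = np.linalg.norm(U @ U.T - M) / np.linalg.norm(M)
print(f"relative error: {err:.2e}")
```

The step size and iteration count here are ad hoc choices for this small instance, not tuned constants from any theory.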