Review for NeurIPS paper: Complex Dynamics in Simple Neural Networks: Understanding Gradient Flow in Phase Retrieval


Weaknesses: One of the main assumptions used in characterizing the "threshold states" at which the gradient-flow dynamics appear to get trapped is that the Hessian is positive semidefinite. However, in figure 2, as the training loss crosses the threshold energy, the minimal eigenvalue of the Hessian appears clearly negative, unless I am misunderstanding the figure. The authors do not appear to address this point. Could the unsoundness of this assumption also account for the inaccuracy of the computed value of alpha at which the phase transition occurs, which can be seen in figure 4?

The main result of the paper, the computation of the relative sample size alpha at which the phase transition occurs, does not seem very accurate when compared to the experiments in figures 3 and 5. It would also have been helpful to plot the predicted value in these figures to make the comparison clear. The discrepancy could be a result of finite-size effects, as the authors claim, but it could also stem from, say, the assumption made about the Hessian at the threshold states, or from the accuracy of the 1RSB ansatz.
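The positive-semidefiniteness question raised above is straightforward to probe numerically. The sketch below is my own construction, not the authors' experimental setup: the quartic phase-retrieval loss, the problem sizes, the step size, and the use of small-step gradient descent as a gradient-flow surrogate are all illustrative assumptions. It runs descent from a random initialization and reports the minimal Hessian eigenvalue at the endpoint, which is the quantity one would compare against the PSD assumption at the trapped states.

```python
import numpy as np

rng = np.random.default_rng(0)

d, n = 20, 60                      # dimension and sample count (alpha = n/d = 3)
w_star = rng.standard_normal(d)
w_star /= np.linalg.norm(w_star)   # unit-norm teacher vector
X = rng.standard_normal((n, d))
y = (X @ w_star) ** 2              # phaseless (sign-less) measurements

def loss(w):
    # L(w) = (1/4n) * sum_i ((x_i^T w)^2 - y_i)^2
    return 0.25 * np.mean(((X @ w) ** 2 - y) ** 2)

def grad(w):
    r = (X @ w) ** 2 - y
    return X.T @ (r * (X @ w)) / n

def hessian(w):
    # H = (1/n) * sum_i (3 (x_i^T w)^2 - y_i) x_i x_i^T
    c = 3.0 * (X @ w) ** 2 - y
    return (X.T * c) @ X / n

# Small-step gradient descent as a crude surrogate for gradient flow.
w0 = rng.standard_normal(d)
w0 /= np.linalg.norm(w0)
w = w0.copy()
for _ in range(20000):
    w -= 0.005 * grad(w)

lam_min = np.linalg.eigvalsh(hessian(w)).min()
print(f"final loss: {loss(w):.6f}, min Hessian eigenvalue: {lam_min:.4f}")
```

Tracking `lam_min` along the whole trajectory (rather than only at the endpoint, as here) would directly reproduce the quantity plotted against the threshold energy in figure 2.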