Reviews: Global Convergence of Gradient Descent for Deep Linear Residual Networks

Neural Information Processing Systems 

The reviewers appreciated the work on the initialization even if they deemed it incremental. The experiments on the nonlinear network in the rebuttal was useful and I encourage the authors to expand the experimental section using more realistic setups to show how the theory matters in practice.