Reviews: Global Convergence of Gradient Descent for Deep Linear Residual Networks
–Neural Information Processing Systems
The reviewers appreciated the work on the initialization even if they deemed it incremental. The experiments on the nonlinear network in the rebuttal was useful and I encourage the authors to expand the experimental section using more realistic setups to show how the theory matters in practice.
Neural Information Processing Systems
Jan-21-2025, 21:00:22 GMT
- Technology: