Reviews: Are ResNets Provably Better than Linear Predictors?

Neural Information Processing Systems 

I also tend to agree with the authors that the obtained results are, relatively speaking, significant and do offer new insights into ResNet. As such, I voted for acceptance (without a strong opinion), although the outcome could be either... end

The main goal of this work is to understand the effect of skip connections in ResNet through the lens of optimization. Although ResNet is strictly more expressive than simple linear regression (in the sense that linear regression is a special case of ResNet when the weights follow a trivial pattern), its optimization may be more challenging than that of linear regression. The authors formally address this concern by proving that any local minimum of a particular ResNet architecture, or more generally any approximate stationary point, has objective value no larger than that of linear regression. However, finding such a local minimum, as the authors show through a simple example, may still be challenging.
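To make the "special case" remark concrete, here is a minimal sketch in plain Python. The review does not spell out the architecture, so the form used here, w · (x + V tanh(U x)), the names `residual_predictor`, `V_zero`, and `U`, and the synthetic data are all assumptions chosen for illustration. The sketch only checks that zeroing the residual branch recovers a linear predictor with identical loss; it does not reproduce the paper's result about local minima.

```python
import math
import random


def linear_predictor(w, x):
    """Plain linear predictor: w . x"""
    return sum(w_i * x_i for w_i, x_i in zip(w, x))


def residual_predictor(w, V, U, x):
    """Hypothetical one-block residual predictor: w . (x + V tanh(U x)).

    U is k-by-d (input to hidden), V is d-by-k (hidden back to input space).
    This form is an assumption made for illustration only.
    """
    hidden = [math.tanh(sum(u * x_j for u, x_j in zip(row, x))) for row in U]
    skipped = [
        x_i + sum(v * h for v, h in zip(row, hidden))  # skip connection
        for x_i, row in zip(x, V)
    ]
    return sum(w_i * s_i for w_i, s_i in zip(w, skipped))


def squared_loss(predict, data):
    """Mean squared error of a predictor over (x, y) pairs."""
    return sum((predict(x) - y) ** 2 for x, y in data) / len(data)


random.seed(0)
d, k = 3, 4  # input dimension and hidden width (arbitrary choices)

# Synthetic data, used only to compare the two losses.
data = [
    ([random.gauss(0, 1) for _ in range(d)], random.gauss(0, 1))
    for _ in range(20)
]
w = [random.gauss(0, 1) for _ in range(d)]
U = [[random.gauss(0, 1) for _ in range(d)] for _ in range(k)]

# The "trivial pattern": the residual branch contributes nothing.
V_zero = [[0.0] * k for _ in range(d)]

loss_linear = squared_loss(lambda x: linear_predictor(w, x), data)
loss_resnet = squared_loss(lambda x: residual_predictor(w, V_zero, U, x), data)
```

With `V_zero`, the two losses coincide for every choice of `w` and `U`, which is the sense in which the residual model contains all linear predictors. The harder question the review summarizes is whether training can get stuck at a point worse than the best linear predictor.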