Reviews: Algorithmic Regularization in Learning Deep Homogeneous Models: Layers are Automatically Balanced

Neural Information Processing Systems 

Edit after author feedback (I do not wish to change my score):

- To me, balance of positive quantities is not only about their difference: they should also have a similar order of magnitude. The difference between 1e-13 and 1 is pretty small in absolute terms, yet the two quantities are clearly unbalanced (see the numeric sketch below).
- The same goes for the "Theta" and "poly" notation. None of the statements in the paper involving these notations quantifies over the relevant variables: epsilon, d, d1 and d2 are fixed and are never bound by a "forall" quantifier. The fact that a notation is standard does not mean that it cannot be misused.

The authors consider deep learning models with a specific class of activation functions which ensures that the model remains homogeneous: multiplying the weights of one layer by a positive scalar and dividing the weights of another layer by the same scalar does not change the prediction of the network (illustrated in the second sketch below).
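A minimal numeric sketch of the balance point above (plain Python; the values are chosen purely for illustration): two positive quantities can have a modest difference while their orders of magnitude differ wildly.

```python
# Illustrative values only: a "small" difference does not imply balance
# in the multiplicative (order-of-magnitude) sense.
a, b = 1e-13, 1.0

print(abs(b - a))  # ~1.0: modest in absolute terms
print(b / a)       # 1e13: thirteen orders of magnitude apart
```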
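And a minimal sketch of the rescaling invariance described in the last paragraph, assuming a two-layer ReLU network (the shapes, seed, and the scalar alpha are arbitrary, not taken from the paper): because ReLU is positively homogeneous, scaling one layer up and the adjacent layer down by the same positive factor leaves the prediction unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
relu = lambda z: np.maximum(z, 0.0)  # positively homogeneous: relu(c*z) = c*relu(z) for c > 0

W1 = rng.standard_normal((4, 3))  # first-layer weights (illustrative shapes)
W2 = rng.standard_normal((2, 4))  # second-layer weights
x = rng.standard_normal(3)        # an arbitrary input

alpha = 7.3  # any positive scalar
original = W2 @ relu(W1 @ x)
rescaled = (W2 / alpha) @ relu((alpha * W1) @ x)

print(np.allclose(original, rescaled))  # True: the prediction is unchanged
```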