Reviews: Algorithmic Regularization in Learning Deep Homogeneous Models: Layers are Automatically Balanced

Neural Information Processing Systems 

Edit after author feedback (I do not wish to change my score):

- To me, balance of positive quantities is not only about their difference: they should also have a similar order of magnitude. The difference between 1e-13 and 1 is pretty small in absolute terms, yet the two quantities are clearly unbalanced (see the numeric sketch below).
- The same goes for the "Theta" and "poly" notation. None of the statements in the paper involving these notations quantifies over the relevant variables: epsilon, d, d1 and d2 are fixed and are never bound by a "forall" quantifier. The fact that a notation is standard does not mean that it cannot be misused.

The authors consider deep learning models with a specific class of activation functions which ensures that the model remains homogeneous: multiplying the weights of one layer by a positive scalar and dividing the weights of another layer by the same scalar does not change the prediction of the network (illustrated in the second sketch below).
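A minimal numeric sketch of the balance point above (plain Python; the values are chosen purely for illustration): two positive quantities can have a modest difference while their orders of magnitude differ wildly.

```python
# Illustrative values only: a "small" difference does not imply balance
# in the multiplicative (order-of-magnitude) sense.
a, b = 1e-13, 1.0

print(abs(b - a))  # ~1.0: modest in absolute terms
print(b / a)       # 1e13: thirteen orders of magnitude apart
```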
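And a minimal sketch of the rescaling invariance described in the last paragraph, assuming a two-layer ReLU network (the shapes, seed, and the scalar alpha are arbitrary, not taken from the paper): because ReLU is positively homogeneous, scaling one layer up and the adjacent layer down by the same positive factor leaves the prediction unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
relu = lambda z: np.maximum(z, 0.0)  # positively homogeneous: relu(c*z) = c*relu(z) for c > 0

W1 = rng.standard_normal((4, 3))  # first-layer weights (illustrative shapes)
W2 = rng.standard_normal((2, 4))  # second-layer weights
x = rng.standard_normal(3)        # an arbitrary input

alpha = 7.3  # any positive scalar
original = W2 @ relu(W1 @ x)
rescaled = (W2 / alpha) @ relu((alpha * W1) @ x)

print(np.allclose(original, rescaled))  # True: the prediction is unchanged
```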