Training Scale-Invariant Neural Networks on the Sphere Can Happen in Three Regimes

Maxim Kodryan

Neural Information Processing Systems 

The effective learning rate (ELR) may obscure certain characteristics of the intrinsic structure of the loss landscape.