Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks

Tim Salimans, Durk P. Kingma

Neural Information Processing Systems

Our reparameterization is inspired by batch normalization but does not introduce any dependencies between the examples in a minibatch.
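
To make the contrast with batch normalization concrete, the following is a minimal sketch. It assumes the usual form of the reparameterization, w = (g / ||v||) v, which this excerpt does not spell out. The helper names (`weight_norm`, `neuron`, `batch_norm`) and the toy inputs are hypothetical and chosen only for illustration. Under these assumptions, the weight-normalized output for an example is the same whichever other examples share its minibatch, whereas a simplified batch-normalized output (no learned scale or shift) changes with them.

```python
import math

def weight_norm(v, g):
    """Return w = g * v / ||v||, so the length of w is |g| whatever v is."""
    norm = math.sqrt(sum(x * x for x in v))
    return [g * x / norm for x in v]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def neuron(x, v, g, b=0.0):
    """Pre-activation w . x + b for one example x; no other example is used."""
    return dot(weight_norm(v, g), x) + b

def batch_norm(values, eps=1e-5):
    """Standardize pre-activations using the mean and variance of the minibatch."""
    mean = sum(values) / len(values)
    var = sum((t - mean) ** 2 for t in values) / len(values)
    return [(t - mean) / math.sqrt(var + eps) for t in values]

# Hypothetical parameters and inputs.
v, g = [3.0, 4.0], 2.0
w = weight_norm(v, g)

x = [1.0, 2.0]                               # the example we track
batch_a = [x, [0.0, 1.0], [1.0, 0.0]]        # x with one set of companions
batch_b = [x, [5.0, -3.0], [2.0, 2.0]]       # x with different companions

# Weight normalization: each output depends only on its own example.
wn_a = [neuron(e, v, g) for e in batch_a]
wn_b = [neuron(e, v, g) for e in batch_b]

# Batch normalization of the raw pre-activations v . x: outputs are coupled
# through the minibatch statistics.
bn_a = batch_norm([dot(v, e) for e in batch_a])
bn_b = batch_norm([dot(v, e) for e in batch_b])

print("weight norm, x in batch A vs B:", wn_a[0], wn_b[0])
print("batch norm,  x in batch A vs B:", bn_a[0], bn_b[0])
```

In this sketch the direction of the weight vector is carried by `v` and its length by `g`, so the normalization is a function of the parameters alone and needs no minibatch statistics.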