AITopics | simple reparameterization

Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks

Neural Information Processing SystemsNov-21-2025, 15:33:34 GMT

By reparameterizing the weights in this way we improve the conditioning of the optimization problem and we speed up convergence of stochastic gradient descent. Our reparameterization is inspired by batch normalization but does not introduce any dependencies between the examples in a minibatch. This means that our method can also be applied successfully to recurrent models such as LSTMs and to noise-sensitive applications such as deep reinforcement learning or generative models, for which batch normalization is less well suited. Although our method is much simpler, it still provides much of the speed-up of full batch normalization. In addition, the computational overhead of our method is lower, permitting more optimization steps to be taken in the same amount of time. We demonstrate the usefulness of our method on applications in supervised image recognition, generative modelling, and deep reinforcement learning.

accelerate training, simple reparameterization, weight normalization, (6 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.64)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.61)

Add feedback

Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks

Neural Information Processing SystemsFeb-11-2025, 20:34:42 GMT

By reparameterizing the weights in this way we improve the conditioning of the optimization problem and we speed up convergence of stochastic gradient descent. Our reparameterization is inspired by batch normalization but does not introduce any dependencies between the examples in a minibatch. This means that our method can also be applied successfully to recurrent models such as LSTMs and to noise-sensitive applications such as deep reinforcement learning or generative models, for which batch normalization is less well suited. Although our method is much simpler, it still provides much of the speed-up of full batch normalization. In addition, the computational overhead of our method is lower, permitting more optimization steps to be taken in the same amount of time.

batch normalization, simple reparameterization, weight normalization, (4 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.64)

Add feedback

Reviews: Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks

Neural Information Processing SystemsJan-20-2025, 22:12:06 GMT

The suggested reparametrisation and its theoretical analysis are very interesting and I enjoyed reading the paper. However, some points in the theoretical analysis could be improved: The paper argues that the new parametrisation improves the conditioning matrix of the gradient, but neither a strong theoretical argument nor a empirical demonstration for this are given. In line 127 it is said "Empirically, we find that w is often (close to) a dominant eigenvector of the covariance matrix C", but the correspond experiments are neither shown in the paper nor in the supplemental material. In line 122/123 the authors claim "It has been observed that neural networks with batch normalization also have this property (to be relatively insensitive to different learning rates), which can be explained by this analysis.". However, it did not became clear to me, how the analysis of the previous sections can be directly transferred to batch normalisation.

gradient, neural network, simple reparameterization, (11 more...)

Neural Information Processing Systems

Genre: Research Report (0.37)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.53)

Add feedback

Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks

Salimans, Tim, Kingma, Durk P.

Neural Information Processing SystemsFeb-14-2020, 07:11:15 GMT

By reparameterizing the weights in this way we improve the conditioning of the optimization problem and we speed up convergence of stochastic gradient descent. Our reparameterization is inspired by batch normalization but does not introduce any dependencies between the examples in a minibatch. This means that our method can also be applied successfully to recurrent models such as LSTMs and to noise-sensitive applications such as deep reinforcement learning or generative models, for which batch normalization is less well suited. Although our method is much simpler, it still provides much of the speed-up of full batch normalization. In addition, the computational overhead of our method is lower, permitting more optimization steps to be taken in the same amount of time.

batch normalization, simple reparameterization, weight normalization, (4 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.64)

Add feedback

Filters

Collaborating Authors

simple reparameterization

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks

Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks

Reviews: Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks

Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks