Reviews: Batch Renormalization: Towards Reducing Minibatch Dependence in Batch-Normalized Models
–Neural Information Processing Systems
In this paper, the authors propose the Batch Renormalization technique to alleviate the problems batchnorm encounters with small or non-i.i.d. minibatches. Reducing the dependence on large minibatch sizes is important in many applications, especially when training large neural network models with limited GPU memory. The proposed method is very simple to understand and implement, and experiments show that Batch Renormalization performs well with non-i.i.d. minibatches and improves the results for small minibatches compared with batchnorm.

Firstly, the authors give a clear review of batchnorm and conclude that its key drawbacks are the inconsistency between the mean and variance used in training and inference, and the instability when dealing with small minibatches. Using moving averages to perform the normalization would be the first thought; however, as the authors argue, this would lead to the model blowing up.
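To make the reviewed mechanism concrete, the following is a minimal NumPy sketch of Batch Renormalization as this review summarizes it: the minibatch statistics are corrected by clipped factors r and d, treated as constants in the backward pass, so that the training-time transform matches the inference-time transform based on moving averages. The function names and the momentum-style moving-average update below are illustrative assumptions, not the authors' released code.

```python
import numpy as np

def batch_renorm_train(x, gamma, beta, mu, sigma,
                       r_max=3.0, d_max=5.0, momentum=0.01, eps=1e-5):
    """One training-time Batch Renormalization step for (N, C) activations x.

    mu and sigma are the running (moving-average) mean and std estimates.
    r_max and d_max follow the paper's notation; the update schedule here
    is a simplifying assumption.
    """
    # Per-feature minibatch statistics, as in standard batchnorm.
    mu_b = x.mean(axis=0)
    sigma_b = np.sqrt(x.var(axis=0) + eps)

    # Correction factors r and d: clipped for stability and treated as
    # constants in the backward pass (no gradient flows through them).
    r = np.clip(sigma_b / sigma, 1.0 / r_max, r_max)
    d = np.clip((mu_b - mu) / sigma, -d_max, d_max)

    # Renormalized activation: with r = 1 and d = 0 this is plain batchnorm.
    x_hat = (x - mu_b) / sigma_b * r + d
    y = gamma * x_hat + beta

    # Update the running statistics used at inference time.
    mu = mu + momentum * (mu_b - mu)
    sigma = sigma + momentum * (sigma_b - sigma)
    return y, mu, sigma

def batch_renorm_infer(x, gamma, beta, mu, sigma):
    # Inference uses the moving averages directly; the r, d correction makes
    # the training-time transform agree with this one in expectation.
    return gamma * (x - mu) / sigma + beta
```

Note that with r = 1 and d = 0 the training transform reduces exactly to standard batchnorm, which is why the paper can start training with tight clipping bounds (r_max = 1, d_max = 0) and gradually relax them.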