[D] Training with Batch Normalization • r/MachineLearning

Dec-10-2017, 09:10:13 GMT–@machinelearnbot

Hi, despite all the alchemy which Batch Norm does behind the covariate shift, I understand it simply as normalization layer which tries to keep all activation within some prior distribution. This is especially helpful at the beginning of the training process where badly chosen initialization may lead to vanishing or exploding signal in the network. Once the network is trained the batch norm in test time uses moving averages of estimated mean and variance of training population i.e. it applies simple linear transformation. Did anyone try to compare estimated values of mean and variance, with those computes from whole training set (i.e. I'm wonder how this would affect test accuracy.

artificial intelligence, machine learning, social media, (4 more...)

@machinelearnbot

Dec-10-2017, 09:10:13 GMT

News Web Page

Add feedback

Industry:
- Media > News (0.40)

Technology:
- Information Technology
  - Communications > Social Media (1.00)
  - Artificial Intelligence > Machine Learning (0.82)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found