Towards a Theoretical Understanding of Batch Normalization
