iclr2019
Proxy-Normalizing Activations to Match Batch Normalization while Removing Batch Dependence
We find that the prototypical techniques of layer normalization and instance normalization both induce the appearance of failure modes in the neural network's pre-activations: (i) layer normalization induces a collapse towards channel-wise constant functions; (ii) instance normalization induces a lack of variability in instance statistics, symptomatic of an alteration of the expressivity.
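To make concrete which statistics each technique pins, here is a minimal sketch in plain Python. It is an illustration only, not the authors' implementation: the toy data, the `EPS` value, and the helper names (`instance_norm`, `layer_norm`, `channel_means`) are all assumptions, and the learned scale and shift parameters are left out. It does not reproduce the failure modes themselves. It only shows that instance normalization forces the per-channel statistics of every instance to the same values, while layer normalization fixes only the statistics pooled over all channels of an instance.

```python
import random
from statistics import fmean, pvariance

EPS = 1e-5  # small stability constant (assumed value)


def _normalize(values, mean, var):
    """Standardize a list of values with the given mean and variance."""
    return [(v - mean) / (var + EPS) ** 0.5 for v in values]


def instance_norm(x):
    """x[b][c] is a list of spatial values.
    Each (instance, channel) pair is normalized with its own statistics."""
    return [[_normalize(ch, fmean(ch), pvariance(ch)) for ch in inst] for inst in x]


def layer_norm(x):
    """Each instance is normalized with statistics pooled over
    all of its channels and spatial positions."""
    out = []
    for inst in x:
        flat = [v for ch in inst for v in ch]
        mean, var = fmean(flat), pvariance(flat)
        out.append([_normalize(ch, mean, var) for ch in inst])
    return out


def channel_means(x):
    """Per-instance, per-channel means: result[b][c]."""
    return [[fmean(ch) for ch in inst] for inst in x]


rng = random.Random(0)
# Toy pre-activations: 4 instances, 3 channels, 16 positions.
# Channel c is drawn around the offset c so that channels differ.
x = [[[rng.gauss(c, 1.0) for _ in range(16)] for c in range(3)] for _ in range(4)]

in_means = channel_means(instance_norm(x))  # identical (zero) for every instance
ln_means = channel_means(layer_norm(x))     # only their average per instance is zero

for b in range(len(x)):
    print("instance", b,
          "IN channel means:", [round(m, 3) for m in in_means[b]],
          "LN channel means:", [round(m, 3) for m in ln_means[b]])
```

After instance normalization, the per-channel means are zero for every instance, so these statistics carry no variability across instances. After layer normalization, the per-channel means are free to differ from zero and from each other; only their average over channels is constrained.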