 iclr2019


14da15db887a4b50efe5c1bc66537089-AuthorFeedback.pdf

Neural Information Processing Systems

We are grateful for all the reviewers' valuable suggestions and questions. The results are displayed in Figure 1. [...] stands for equality up to zero-valued paddings. [...] ICLR2019), but with the top layer to be zero. We will clarify this in the revised version.


Proxy-Normalizing Activations to Match Batch Normalization while Removing Batch Dependence

Neural Information Processing Systems

We find that the prototypical techniques of layer normalization and instance normalization both induce the appearance of failure modes in the neural network's pre-activations: (i) layer normalization induces a collapse towards channel-wise constant functions; (ii) instance normalization induces a lack of variability in instance statistics, symptomatic of an alteration of the expressivity.
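
As a rough illustration of point (ii), the sketch below contrasts the axes over which the two techniques compute their statistics, using the standard textbook definitions rather than anything taken from the paper. The toy input, the `EPS` constant, and the helper names `layer_norm`, `instance_norm`, and `instance_means` are all assumptions made for this example; learned scale and shift parameters are omitted, and the collapse described in point (i) is not modeled.

```python
import math
from statistics import fmean, pvariance

EPS = 1e-5  # assumed numerical-stability constant, not a value from the paper


def layer_norm(x):
    """Normalize each instance over all of its channels and spatial positions."""
    out = []
    for inst in x:  # inst is a list of channels, each a list of spatial values
        flat = [v for ch in inst for v in ch]
        mu, var = fmean(flat), pvariance(flat)
        out.append([[(v - mu) / math.sqrt(var + EPS) for v in ch] for ch in inst])
    return out


def instance_norm(x):
    """Normalize each (instance, channel) pair over spatial positions only."""
    out = []
    for inst in x:
        normed = []
        for ch in inst:
            mu, var = fmean(ch), pvariance(ch)
            normed.append([(v - mu) / math.sqrt(var + EPS) for v in ch])
        out.append(normed)
    return out


def instance_means(x):
    """Per-instance, per-channel means, i.e. the 'instance statistics'."""
    return [[fmean(ch) for ch in inst] for inst in x]


# Hypothetical toy activations: 4 instances, 3 channels, 16 spatial positions.
# Each (instance, channel) pair gets its own offset so the raw means differ.
x = [
    [[2.0 * c + 0.5 * n + math.sin(s + c) for s in range(16)] for c in range(3)]
    for n in range(4)
]

ln_means = instance_means(layer_norm(x))
in_means = instance_means(instance_norm(x))

print("layer norm, per-channel means of instance 0:", ln_means[0])
print("instance norm, per-channel means of instance 0:", in_means[0])
```

Under these assumptions, instance normalization pins every per-instance, per-channel mean to zero (up to floating-point error), while layer normalization leaves those means free to differ between channels. This only shows the mechanical effect of the normalization axes; it is not a reproduction of the paper's analysis.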