Proxy-Normalizing Activations to Match Batch Normalization while Removing Batch Dependence
Neural Information Processing Systems
We find that the prototypical techniques of layer normalization and instance normalization both induce the appearance of failure modes in the neural network's pre-activations: (i) layer normalization induces a collapse towards channel-wise constant functions; (ii) instance normalization induces a lack of variability in instance statistics, symptomatic of an alteration of the expressivity.
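Failure mode (ii) can be illustrated with a toy example. The sketch below is not the paper's code: the random pre-activations and the helper names (`instance_norm`, `instance_means`, `spread_across_instances`) are assumptions made for illustration only. Instance normalization standardizes each channel separately for each instance, so the per-instance channel means end up numerically identical across instances and their variability vanishes. The layer normalization collapse in (i) is not reproduced here.

```python
import random
from statistics import fmean, pvariance


def instance_norm(x, eps=1e-5):
    """Standardize each (instance, channel) slice over its spatial positions."""
    out = []
    for instance in x:
        normed = []
        for channel in instance:
            mu = fmean(channel)
            var = pvariance(channel, mu)
            normed.append([(v - mu) / (var + eps) ** 0.5 for v in channel])
        out.append(normed)
    return out


def instance_means(x):
    """Per-instance, per-channel mean over spatial positions."""
    return [[fmean(channel) for channel in instance] for instance in x]


def spread_across_instances(stats):
    """Variance across instances of a per-instance statistic, one value per channel."""
    n_channels = len(stats[0])
    return [pvariance([row[c] for row in stats]) for c in range(n_channels)]


random.seed(0)

# Hypothetical pre-activations: 8 instances, 3 channels, 16 spatial positions.
# Each (instance, channel) slice gets its own random offset, so the
# instance statistics genuinely differ before normalization.
x = []
for _ in range(8):
    instance = []
    for _ in range(3):
        offset = random.gauss(0.0, 2.0)
        instance.append([offset + random.gauss(0.0, 1.0) for _ in range(16)])
    x.append(instance)

before = spread_across_instances(instance_means(x))
after = spread_across_instances(instance_means(instance_norm(x)))

print("variance of instance means before:", before)
print("variance of instance means after: ", after)
```

Under these assumptions, `before` holds clearly positive values while `after` is zero up to floating-point error. The same holds for the per-instance variances, which instance normalization pins close to one.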
Feb-9-2026, 20:14:01 GMT