The Lazy Neuron Phenomenon: On Emergence of Activation Sparsity in Transformers
