Breaking the Activation Function Bottleneck through Adaptive Parameterization
Sebastian Flennerhag, Hujun Yin, John Keane, Mark Elliot
Neural Information Processing Systems
Standard neural network architectures are non-linear only by virtue of a simple element-wise activation function, making them both brittle and excessively large. In this paper, we consider methods for making the feed-forward layer more flexible while preserving its basic structure. We develop simple drop-in replacements that learn to adapt their parameterization conditional on the input, thereby increasing statistical efficiency significantly.
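As a rough illustration of what "adapting the parameterization conditional on the input" can mean, the sketch below contrasts a standard feed-forward layer with a layer whose effective weights are rescaled by input-dependent gains. This is a minimal sketch under stated assumptions, not the authors' formulation: the function names, the gain network (`U`, `c`), and the choice of a sigmoid-based gain in (0, 2) are all hypothetical and chosen only for illustration.

```python
import numpy as np


def static_layer(x, W, b):
    """Standard feed-forward layer: the only non-linearity is element-wise tanh."""
    return np.tanh(W @ x + b)


def adaptive_layer(x, W, b, U, c):
    """Illustrative input-conditioned layer (hypothetical, not the paper's exact method).

    A small gain network (U, c) maps the input to one gain per input feature.
    The gains rescale x before the shared weights W are applied, which is
    equivalent to using the input-dependent weight matrix W @ diag(gains).
    """
    gains = 2.0 / (1.0 + np.exp(-(U @ x + c)))  # each gain lies in (0, 2)
    return np.tanh(W @ (gains * x) + b)


rng = np.random.default_rng(0)
d_in, d_out = 4, 3
W = rng.normal(size=(d_out, d_in))
b = np.zeros(d_out)
U = rng.normal(size=(d_in, d_in))
c = np.zeros(d_in)
x = rng.normal(size=d_in)

y_static = static_layer(x, W, b)
y_adaptive = adaptive_layer(x, W, b, U, c)

# With a zero gain network every gain equals 1, so the adaptive layer
# reduces to the static layer and can be swapped in without changing shapes.
y_neutral = adaptive_layer(x, W, b, np.zeros((d_in, d_in)), np.zeros(d_in))
```

In this sketch the adaptive layer keeps the input and output shapes of the static one, which is what makes it usable as a drop-in replacement; the added cost is the gain network's parameters.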
Oct-8-2024, 03:59:12 GMT