Breaking the Activation Function Bottleneck through Adaptive Parameterization

Sebastian Flennerhag, Hujun Yin, John Keane, Mark Elliot

Neural Information Processing Systems 

Standard neural network architectures are non-linear only by virtue of a simple element-wise activation function, making them both brittle and excessively large. In this paper, we consider methods for making the feed-forward layer more flexible while preserving its basic structure. We develop simple drop-in replacements that learn to adapt their parameterization conditional on the input, thereby increasing statistical efficiency significantly.
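To make the idea of input-conditioned parameterization concrete, here is a minimal sketch of a feed-forward layer whose effective weight matrix is rescaled by a gain computed from the input. The form of the adaptation (a tanh gain applied as a diagonal rescaling) and the names adaptive_layer, U, and c are assumptions chosen for illustration, not necessarily the formulation developed in the paper.

```python
import numpy as np

def adaptive_layer(x, W, b, U, c):
    """Feed-forward layer whose effective weights depend on the input.

    Illustrative assumption: the adaptation is a per-unit gain
    g(x) = tanh(U @ x + c), applied as a diagonal rescaling of W @ x.
    """
    gain = np.tanh(U @ x + c)   # input-conditioned scaling, one value per output unit
    return gain * (W @ x) + b   # same as (diag(gain) @ W) @ x + b

# Hypothetical sizes and random parameters, for demonstration only.
rng = np.random.default_rng(0)
n_in, n_out = 4, 3
W = rng.normal(size=(n_out, n_in))
b = np.zeros(n_out)
U = rng.normal(size=(n_out, n_in))
c = np.zeros(n_out)

x = rng.normal(size=n_in)
y = adaptive_layer(x, W, b, U, c)
```

In this sketch the non-linearity comes from the weights changing with the input, not from an element-wise activation applied to the output, while the layer keeps the same input and output shapes as an ordinary feed-forward layer.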
