Breaking the Activation Function Bottleneck through Adaptive Parameterization
Sebastian Flennerhag, Hujun Yin, John Keane, Mark Elliot
Neural Information Processing Systems
Adaptive parameterization is a means of increasing this flexibility and thereby increasing the model's capacity to learn non-linear patterns. We focus on the feed-forward layer, f(x) := φ(Wx + b), for some activation function φ: ℝ → ℝ. Define the pre-activation layer as a = A(x) := Wx + b and denote by g(a) := φ(a)/a the activation effect of φ given a, where division is element-wise.
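The definitions above imply that the layer can be rewritten as f(x) = g(a) ⊙ a wherever a is non-zero, so the activation function behaves like an input-dependent, element-wise rescaling of the pre-activation. The following is a minimal sketch of that identity in plain Python. The weights, bias, input, helper names, and the choice of ReLU and tanh are illustrative assumptions and are not taken from the paper.

```python
import math

def pre_activation(W, b, x):
    """Compute a = W x + b, with W given as a list of rows."""
    return [sum(w * x_j for w, x_j in zip(row, x)) + b_i
            for row, b_i in zip(W, b)]

def activation_effect(phi, a):
    """Compute g(a) = phi(a) / a element-wise (undefined where a_i == 0)."""
    return [phi(a_i) / a_i for a_i in a]

def relu(z):
    return max(z, 0.0)

# Hypothetical example values, chosen only for illustration.
W = [[1.0, -2.0], [0.5, 1.0]]
b = [0.5, 1.0]
x = [1.0, 1.0]

a = pre_activation(W, b, x)

# For ReLU the activation effect is a binary gate on each unit.
g_relu = activation_effect(relu, a)
f_relu = [g_i * a_i for g_i, a_i in zip(g_relu, a)]

# For tanh the activation effect is a smooth scaling factor.
g_tanh = activation_effect(math.tanh, a)
f_tanh = [g_i * a_i for g_i, a_i in zip(g_tanh, a)]

print("a      =", a)
print("g_relu =", g_relu, "f_relu =", f_relu)
print("g_tanh =", g_tanh, "f_tanh =", f_tanh)
```

Under this view, ReLU's effect is either 0 or 1 per unit, while tanh's effect lies strictly between 0 and 1 for non-zero a. In both cases the scaling is fully determined by a itself.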