RepAct: The Re-parameterizable Adaptive Activation Function
Xian Wu, Qingchuan Tao, Shuang Wang
Since the resurgence of neural networks with AlexNet [1], the design of activation functions has garnered continuous attention [2]. Activation functions introduce non-linearity into networks and play a critical role in feature extraction. In traditional neural network design, activation functions for different layers and networks are typically selected based on manual design experience [3, 4] or adapted through NAS (Neural Architecture Search) [5, 6]. The emergence of adaptive activation functions [6, 7, 8] effectively improved network performance, and various dynamic adaptive parameters were subsequently introduced along different dimensions of the feature map [9, 10, 11]. Although these introduce few parameters and little computation, the memory cost of their element-wise operations often becomes a bottleneck for lightweight network inference [12]. When deploying lightweight networks on resource-constrained edge devices, real-time requirements impose strict limits on model parameters, computational cost, and memory operations [12, 13, 14, 15]. Moreover, convolutional neural networks exhibit sparsity in their activations [16], which prevents lightweight networks from fully exploiting their model capacity to learn features from task data. We note that re-parameterizable convolutional structures [17, 18, 19, 20, 21], by virtue of their multi-branch architecture during training, enhance a network's feature-capturing ability.
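The core idea behind such structural re-parameterization, applied here to activations rather than convolutions, is that a multi-branch form used during training can be algebraically folded into a single operator for inference, avoiding extra element-wise branches and their memory traffic at deploy time. The following is a minimal illustrative sketch, not the paper's exact formulation (the branch set, weighting scheme, and class/parameter names are assumptions): it combines ReLU, LeakyReLU, and identity branches with learnable weights, and, since all three are piecewise linear with a breakpoint at zero, collapses them into one two-slope activation after training.

# Illustrative sketch only; RepAct's actual branch set and parameterization
# are not specified in this excerpt.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReparamAdaptiveAct(nn.Module):
    """Training time: weighted sum of ReLU, LeakyReLU, and identity branches.
    Inference time: the branches fold into a single two-slope activation."""

    def __init__(self, negative_slope: float = 0.1):
        super().__init__()
        self.negative_slope = negative_slope
        # One learnable weight per branch (hypothetical parameterization).
        self.w = nn.Parameter(torch.ones(3) / 3)
        self.deployed = False
        self.pos_slope = None
        self.neg_slope = None

    def forward(self, x):
        if self.deployed:
            # Single fused piecewise-linear activation.
            return torch.where(x >= 0, self.pos_slope * x, self.neg_slope * x)
        w = self.w
        return (w[0] * F.relu(x)
                + w[1] * F.leaky_relu(x, self.negative_slope)
                + w[2] * x)

    @torch.no_grad()
    def switch_to_deploy(self):
        # All branches are piecewise linear with a breakpoint at 0, so their
        # weighted sum reduces to one slope per side, computed once.
        w = self.w
        self.pos_slope = (w[0] + w[1] + w[2]).clone()
        self.neg_slope = (w[1] * self.negative_slope + w[2]).clone()
        self.deployed = True

After training, calling switch_to_deploy() plays the same role as branch fusion in RepVGG-style re-parameterized convolutions [17, 18, 19, 20, 21]: the multi-branch capacity is available while learning, but inference runs a single element-wise operation.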
arXiv.org Artificial Intelligence, Jun-28-2024