Lightweight Convolutional Neural Networks By Hypercomplex Parameterization

Grassucci, Eleonora, Zhang, Aston, Comminiello, Danilo

arXiv.org Artificial Intelligence 

Hypercomplex neural networks have proven to reduce the overall number of parameters while ensuring strong performance by leveraging the properties of Clifford algebras. Recently, hypercomplex linear layers have been further improved by involving efficient parameterized Kronecker products. In this paper, we define the parameterization of hypercomplex convolutional layers to develop lightweight and efficient large-scale convolutional models. Our method grasps the convolution rules and the filter organization directly from data, without requiring a rigidly predefined domain structure to follow. The proposed approach is flexible enough to operate in any user-defined or tuned domain, from 1D to nD, regardless of whether the algebra rules are preset. Such malleability allows processing multidimensional inputs in their natural domain without appending further dimensions, as is done, instead, in quaternion neural networks for 3D inputs such as color images. As a result, the proposed method operates with 1/n free parameters with respect to its analog in the real domain. We demonstrate the versatility of this approach across multiple application domains by performing experiments on various image and audio datasets, in which our method outperforms real- and quaternion-valued counterparts.

Recent state-of-the-art convolutional models have achieved astonishing results in various fields of application by scaling up the overall number of parameters (Karras et al., 2020; d'Ascoli et al., 2021; Dosovitskiy et al., 2021). Simultaneously, quaternion neural networks (QNNs) have been shown to significantly reduce the number of parameters while still attaining comparable performance (Parcollet et al., 2019c; Grassucci et al., 2021a; Tay et al., 2019).
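The parameter saving described above can be illustrated with a minimal NumPy sketch of a parameterized hypercomplex weight construction: the weight matrix is built as a sum of Kronecker products between small learnable "algebra" matrices A_i (which play the role of the multiplication rules learned from data) and filter blocks F_i. All variable names and sizes here are illustrative assumptions, not the paper's actual implementation; the convolutional case extends the same construction to filter tensors.

```python
import numpy as np

def ph_weight(A, F):
    """Build W = sum_i kron(A_i, F_i) from stacked factors A and F."""
    return sum(np.kron(A_i, F_i) for A_i, F_i in zip(A, F))

n = 4                      # hypercomplex dimension (e.g. 4 mimics quaternions)
out_f, in_f = 64, 64       # layer sizes, chosen divisible by n
rng = np.random.default_rng(0)

# n learnable (n x n) rule matrices and n small filter blocks.
A = rng.standard_normal((n, n, n))
F = rng.standard_normal((n, out_f // n, in_f // n))

W = ph_weight(A, F)        # full (out_f x in_f) weight, never stored directly
free_params = A.size + F.size          # n^3 + out_f * in_f / n
dense_params = out_f * in_f            # real-valued counterpart
print(W.shape, free_params, dense_params)
```

For layer sizes where out_f * in_f dominates the small n^3 term, the free-parameter count approaches 1/n of the dense real-valued layer, matching the reduction stated in the abstract.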