On the Expressiveness of Rational ReLU Neural Networks With Bounded Depth

Gennadiy Averkov, Christopher Hojny, Maximilian Merkert

arXiv.org Artificial Intelligence 

For p = 3, this covers the cases of binary fractions as well as decimal fractions, two of the most common practical settings. Moreover, it shows that, for every N, the expressive power of ReLU networks keeps growing up to depth O(log n). In the case of rational weights that are N-ary fractions for any fixed N, even allowing arbitrarily large denominators and arbitrary width does not make exact representations possible at small constant depth. Theorem 4 can be viewed as a partial confirmation of Conjecture 1 for rational weights, since the term ln ln N grows extremely slowly in N. If one could replace ln ln N by a constant, the conjecture would be confirmed for rational weights, up to a constant multiple.

As already highlighted in Haase et al. (2023), a confirmation of the conjecture would theoretically explain the significance of max-pooling in the context of ReLU networks: it seems that the expressive power of the ReLU activation alone is not enough to model the maximum of a large number of input variables unless sufficiently deep network architectures are used. Hence, enhancing ReLU networks with max-pooling layers could be a way to achieve higher expressive power with shallow networks.
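To make the role of depth concrete, here is a minimal sketch (not taken from the paper; the function names relu_max and pairwise_max_layer are ours) of the standard folklore construction that computes the exact maximum of n inputs with roughly log2(n) ReLU layers, using the identity max(a, b) = (a + b)/2 + (ReLU(a - b) + ReLU(b - a))/2 in a binary tournament. The conjecture discussed above concerns the converse direction, namely whether logarithmic depth is also necessary.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def pairwise_max_layer(x):
    """One stage of the tournament: reduce a vector of length m to
    ceil(m/2) pairwise maxima using linear maps and a single ReLU layer."""
    if len(x) % 2 == 1:          # odd length: duplicate the last entry, since max(x, x) = x
        x = np.append(x, x[-1])
    a, b = x[0::2], x[1::2]
    # max(a, b) = (a + b)/2 + (ReLU(a - b) + ReLU(b - a))/2
    return 0.5 * (a + b) + 0.5 * (relu(a - b) + relu(b - a))

def relu_max(values):
    """Exact maximum of n numbers via ceil(log2 n) ReLU layers."""
    x = np.asarray(values, dtype=float)
    depth = 0
    while len(x) > 1:
        x = pairwise_max_layer(x)
        depth += 1
    return x[0], depth

value, depth = relu_max([3.5, -1.0, 2.25, 7.0, 0.5])
print(value, depth)              # 7.0 with depth = ceil(log2(5)) = 3 ReLU layers
```

Note that the only weights appearing in this construction are ±1 and 1/2, i.e., binary fractions, so depth lower bounds for networks with N-ary fraction weights, such as Theorem 4, apply to networks of exactly this form.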