Better Neural Network Expressivity: Subdividing the Simplex
Egor Bakaev, Florestan Brunck, Christoph Hertrich, Jack Stade, Amir Yehudayoff
arXiv.org Artificial Intelligence
This work studies the expressivity of ReLU neural networks with a focus on their depth. A sequence of previous works showed that $\lceil\log_2(n+1)\rceil$ hidden layers are sufficient to compute all continuous piecewise linear (CPWL) functions on $\mathbb{R}^n$. Hertrich, Basu, Di Summa, and Skutella (NeurIPS'21 / SIDMA'23) conjectured that this result is optimal in the sense that there are CPWL functions on $\mathbb{R}^n$, like the maximum function, that require this depth. We disprove the conjecture and show that $\lceil\log_3(n-1)\rceil+1$ hidden layers are sufficient to compute all CPWL functions on $\mathbb{R}^n$. A key step in the proof is that ReLU neural networks with two hidden layers can exactly represent the maximum function of five inputs. More generally, we show that $\lceil\log_3(n-2)\rceil+1$ hidden layers are sufficient to compute the maximum of $n \geq 4$ numbers. Our constructions almost match the $\lceil\log_3(n)\rceil$ lower bound of Averkov, Hojny, and Merkert (ICLR'25) in the special case of ReLU networks with weights that are decimal fractions. The constructions have a geometric interpretation via polyhedral subdivisions of the simplex into "easier" polytopes.
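To make the depth bounds concrete: for the maximum of $n=5$ numbers, the classical tournament construction needs $\lceil\log_2 5\rceil = 3$ hidden layers, while the paper's bound gives $\lceil\log_3(5-2)\rceil+1 = 2$. Below is a minimal NumPy sketch of the classical construction only; the paper's two-hidden-layer max-of-five gadget is not reproduced here, and the helper names `relu`, `max2`, and `max4` are illustrative, not from the paper. It relies on the standard identity $\max(x,y) = \mathrm{ReLU}(x-y) + \mathrm{ReLU}(y) - \mathrm{ReLU}(-y)$, which one hidden layer of three ReLU units computes exactly.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def max2(x, y):
    # Exact one-hidden-layer gadget: ReLU(y) - ReLU(-y) = y, and
    # ReLU(x - y) + y = max(x, y), so three ReLU units plus a linear
    # output represent max(x, y) exactly.
    return relu(x - y) + relu(y) - relu(-y)

def max4(x1, x2, x3, x4):
    # Tournament composition: the inner gadgets form hidden layer 1,
    # the outer gadget forms hidden layer 2 (the affine maps between
    # layers absorb the linear output combinations), giving an exact
    # two-hidden-layer ReLU network for the maximum of four numbers.
    return max2(max2(x1, x2), max2(x3, x4))

# Numerical sanity check against NumPy's built-in maximum.
rng = np.random.default_rng(0)
xs = rng.normal(size=(1000, 4))
assert np.allclose(max4(*xs.T), xs.max(axis=1))
```

Each added hidden layer in the tournament tree doubles the number of inputs handled, which is exactly where the $\lceil\log_2 n\rceil$ depth comes from; the paper's simplex-subdivision construction improves the base of the logarithm to 3.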
Nov-10-2025