Neural Information Processing Systems
In this section, we compare the symmetric solutions found for erf networks [2] and ReLU networks [5] with our one-neuron solution (n = 1). The main difference is that both earlier studies constrain the search space to the symmetric subspace, whereas we first prove in Theorem 5.1 that the non-trivial critical points are contained in this subspace for a broad class of activation functions, including erf and ReLU. Solving the resulting low-dimensional loss, we recover the same solutions for ReLU and erf as in [2, 5] for unit-orthonormal teachers.