Supplemental: Training Fully Connected Neural Networks is $\exists\mathbb{R}$-Complete

A $\exists\mathbb{R}$-Membership

Membership in $\exists\mathbb{R}$ is already proven by Abrahamsen, Kleist and Miltzow [NeurIPS 2021].

Neural Information Processing Systems

For each line ℓ ∈ L, the change of the gradient of f when crossing ℓ is constant along ℓ. Then there is a fully connected two-layer neural network with m hidden neurons computing f. To see that this observation is true, consider the following construction. Describing all gadgets purely by their data points is tedious and obscures the relatively simple geometry enforced by these data points. A weak data point relaxes a regular data point and prescribes only a lower bound on the value of the label.
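The construction behind this observation can be sketched concretely: dedicate one hidden ReLU neuron to each breakline, with the output weight and neuron direction chosen so that the gradient jump across that line is reproduced, and absorb the remaining slope into an affine part. The following is a minimal illustration of that idea; the target function, variable names, and the specific lines are my own example, not taken from the paper.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def two_layer_net(x, hidden, affine):
    """Evaluate f(x) = w0 @ x + b0 + sum_i c_i * relu(a_i @ x + b_i).

    One hidden neuron per breakline {x : a_i @ x + b_i = 0}; the gradient
    of f jumps by c_i * a_i when crossing that line, matching the
    "constant gradient change along each line" condition.
    """
    w0, b0 = affine
    return w0 @ x + b0 + sum(c * relu(a @ x + b) for a, b, c in hidden)

# Illustrative target: f(x, y) = |x| + max(0, y), which is piecewise
# linear with breaklines x = 0 and y = 0 (so m = 2 hidden neurons).
hidden = [
    (np.array([1.0, 0.0]), 0.0, 2.0),  # gradient jump (2, 0) across x = 0
    (np.array([0.0, 1.0]), 0.0, 1.0),  # gradient jump (0, 1) across y = 0
]
affine = (np.array([-1.0, 0.0]), 0.0)  # -x + 2*relu(x) = |x|
```

Checking a couple of points confirms the construction: at (3, 2) the network gives -3 + 2·3 + 2 = 5 = |3| + max(0, 2), and at (-3, -2) it gives 3 + 0 + 0 = 3.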


Training Neural Networks is NP-Hard in Fixed Dimension

Froese, Vincent, Hertrich, Christoph

arXiv.org Artificial Intelligence

We study the parameterized complexity of training two-layer neural networks with respect to the dimension of the input data and the number of hidden neurons, considering ReLU and linear threshold activation functions. Although the computational complexity of these problems has been studied numerous times in recent years, several questions remain open. We answer questions by Arora et al. [ICLR '18] and Khalife and Basu [IPCO '22] by showing that both problems are NP-hard for two dimensions, which excludes any polynomial-time algorithm for constant dimension. We also answer a question by Froese et al. [JAIR '22] by proving W[1]-hardness for four ReLUs (or two linear threshold neurons) with zero training error. Finally, in the ReLU case, we show fixed-parameter tractability for the combined parameter number of dimensions and number of ReLUs if the network is assumed to compute a convex map. Our results settle the complexity status regarding these parameters almost completely.
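To make the problem being studied concrete: the training (decision) problem asks whether weights exist whose training error is at most a given target, and while finding such weights is hard, evaluating the error of a candidate network is straightforward. Below is a minimal sketch of that evaluation for a two-layer ReLU network; the function name and shape conventions are my own, not from the paper.

```python
import numpy as np

def training_error(W, b, v, X, y):
    """Squared training error of the two-layer ReLU network
    x -> v @ relu(W @ x + b).

    Shapes (illustrative convention): W is (k, d) hidden weights,
    b is (k,) biases, v is (k,) output weights, X is (n, d) data
    points, y is (n,) labels.
    """
    hidden = np.maximum(X @ W.T + b, 0.0)  # (n, k) hidden activations
    return float(np.sum((hidden @ v - y) ** 2))

# A single ReLU in dimension d = 2 fitting three points exactly
# (zero training error, the regime of the W[1]-hardness result).
W = np.array([[1.0, 0.0]])
b = np.array([0.0])
v = np.array([1.0])
X = np.array([[1.0, 0.0], [2.0, 0.0], [-1.0, 0.0]])
y = np.array([1.0, 2.0, 0.0])
```

The hardness results concern the search over (W, b, v), parameterized here by the input dimension d and the number of hidden neurons k.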


Training Fully Connected Neural Networks is $\exists\mathbb{R}$-Complete

Bertschinger, Daniel, Hertrich, Christoph, Jungeblut, Paul, Miltzow, Tillmann, Weber, Simon

arXiv.org Artificial Intelligence

We consider the algorithmic problem of finding the optimal weights and biases for a two-layer fully connected neural network to fit a given set of data points. This problem is known as empirical risk minimization in the machine learning community. We show that the problem is $\exists\mathbb{R}$-complete. This complexity class can be defined as the set of algorithmic problems that are polynomial-time equivalent to finding real roots of a polynomial with integer coefficients. Furthermore, we show that arbitrary algebraic numbers are required as weights to be able to train some instances to optimality, even if all data points are rational. Our results hold even if the following restrictions are all added simultaneously.

$\bullet$ There are exactly two output neurons.
$\bullet$ There are exactly two input neurons.
$\bullet$ The data has only 13 different labels.
$\bullet$ The number of hidden neurons is a constant fraction of the number of data points.
$\bullet$ The target training error is zero.
$\bullet$ The ReLU activation function is used.

This shows that even very simple networks are difficult to train. The result explains why typical methods for $\mathsf{NP}$-complete problems, like mixed-integer programming or SAT-solving, cannot train neural networks to global optimality, unless $\mathsf{NP}=\exists\mathbb{R}$. We strengthen a recent result by Abrahamsen, Kleist and Miltzow [NeurIPS 2021].
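For contrast with the global-optimality question above, empirical risk minimization is routinely attacked in practice with local-search heuristics such as gradient descent, which carry no optimality guarantee. The sketch below runs plain gradient descent on a tiny two-layer ReLU instance; all sizes, the learning rate, and the synthetic target are illustrative choices of mine, and the loop only demonstrates that the training error decreases, not that it reaches the global optimum.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy instance: n = 32 points in R^2 with scalar labels, generated by a
# continuous piecewise linear target so a zero-error fit exists.
X = rng.normal(size=(32, 2))
y = np.maximum(X[:, 0], 0.0) + 0.5 * X[:, 1]

k = 8                                    # hidden ReLUs
W = rng.normal(scale=0.5, size=(k, 2))   # hidden weights
b = np.zeros(k)                          # hidden biases
v = rng.normal(scale=0.5, size=k)        # output weights

def loss(W, b, v):
    return float(np.mean((np.maximum(X @ W.T + b, 0.0) @ v - y) ** 2))

loss0 = loss(W, b, v)
lr = 0.02
for _ in range(3000):
    z = X @ W.T + b                  # (n, k) pre-activations
    h = np.maximum(z, 0.0)           # ReLU activations
    err = h @ v - y                  # residuals
    # Backpropagation by hand (gradients scaled by a constant factor).
    dv = h.T @ err / len(X)
    dz = np.outer(err, v) * (z > 0.0)
    dW = dz.T @ X / len(X)
    db = dz.mean(axis=0)
    v -= lr * dv
    W -= lr * dW
    b -= lr * db

final_loss = loss(W, b, v)
```

The $\exists\mathbb{R}$-completeness result explains why no such heuristic, and no exact method based on $\mathsf{NP}$ machinery, can be expected to certify global optimality: optimal weights may even require irrational algebraic numbers.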