 neuromanifold


Learning on a Razor's Edge: the Singularity Bias of Polynomial Neural Networks

arXiv.org Artificial Intelligence

In this work, we theoretically analyze sub-networks and their bias through the lens of algebraic geometry. We consider fully-connected networks with polynomial activation functions, and focus on the geometry of the function space they parametrize, often referred to as the neuromanifold. First, we compute the dimension of the subspace of the neuromanifold parametrized by subnetworks. Second, we show that this subspace is singular. Third, we argue that such singularities often correspond to critical points of the training dynamics. Lastly, we discuss convolutional networks, for which subnetworks and singularities are similarly related, but the bias does not arise.

Figure 1: Subnetworks define singular points (orange) of the neuromanifold.
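The link between subnetworks and singular points can be illustrated on a toy case. The setup below is an assumption of this sketch, not an example taken from the paper: a shallow network with three inputs, two hidden neurons, and the quadratic activation. Such a network computes a quadratic form x^T M x with M symmetric of rank at most 2, so its function space lies in the hypersurface det M = 0. A point of a hypersurface is singular exactly where the gradient of the defining equation vanishes, and the gradient of the determinant is the cofactor matrix. The helper names `quadratic_network` and `cofactor_matrix` are hypothetical.

```python
import numpy as np

def quadratic_network(W, a):
    """Symmetric matrix M with f(x) = x^T M x = sum_i a_i * (w_i . x)^2."""
    # W has shape (width, n); a has shape (width,).
    return (W.T * a) @ W

def cofactor_matrix(M):
    """Gradient of det at M, treating all n*n entries as independent coordinates.

    On symmetric matrices the gradient is built from the same cofactors,
    so it vanishes exactly when this matrix vanishes.
    """
    n = M.shape[0]
    C = np.empty_like(M)
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(M, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return C

rng = np.random.default_rng(0)
W = rng.standard_normal((2, 3))           # two hidden neurons, three inputs

# Full network: both output weights are nonzero, so M has rank 2.
M_generic = quadratic_network(W, np.array([1.0, -0.5]))

# Subnetwork: the second neuron is switched off, so M has rank 1.
M_sub = quadratic_network(W, np.array([1.0, 0.0]))

grad_generic = np.linalg.norm(cofactor_matrix(M_generic))
grad_sub = np.linalg.norm(cofactor_matrix(M_sub))

print("det at generic point:     ", np.linalg.det(M_generic))
print("gradient norm, generic:   ", grad_generic)
print("gradient norm, subnetwork:", grad_sub)
```

In this toy case the gradient is nonzero at the full network and vanishes, up to rounding, at the subnetwork, which is the familiar statement that the rank-1 matrices form the singular locus of the determinantal hypersurface of 3x3 symmetric matrices.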


An Invitation to Neuroalgebraic Geometry

arXiv.org Artificial Intelligence

In this expository work, we promote the study of function spaces parameterized by machine learning models through the lens of algebraic geometry. To this end, we focus on algebraic models, such as neural networks with polynomial activations, whose associated function spaces are semi-algebraic varieties. We outline a dictionary between algebro-geometric invariants of these varieties, such as dimension, degree, and singularities, and fundamental aspects of machine learning, such as sample complexity, expressivity, training dynamics, and implicit bias. Along the way, we review the literature and discuss ideas beyond the algebraic domain. This work lays the foundations of a research direction bridging algebraic geometry and deep learning, which we refer to as neuroalgebraic geometry.


On the Geometry and Optimization of Polynomial Convolutional Networks

arXiv.org Artificial Intelligence

We study convolutional neural networks with monomial activation functions. Specifically, we prove that their parameterization map is regular and is an isomorphism almost everywhere, up to rescaling the filters. Leveraging tools from algebraic geometry, we explore the geometric properties of the image of this map in function space, typically referred to as the neuromanifold. In particular, we compute the dimension and the degree of the neuromanifold, which measure the expressivity of the model, and describe its singularities. Moreover, for a generic large dataset, we derive an explicit formula that quantifies the number of critical points arising in the optimization of a regression loss.
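The phrase "up to rescaling the filters" can be made concrete with a small numerical check. The sketch below assumes a two-layer, one-dimensional, stride-one network with activation t -> t^r, a simplification chosen here for illustration; `conv_net` is a hypothetical helper. Scaling the first filter by c multiplies the hidden activations by c^r, so dividing the second filter by c^r leaves the computed function unchanged.

```python
import numpy as np

def conv_net(x, k1, k2, r):
    """Two stride-one convolutional layers with monomial activation t -> t**r."""
    hidden = np.correlate(x, k1, mode="valid") ** r
    return np.correlate(hidden, k2, mode="valid")

rng = np.random.default_rng(0)
x = rng.standard_normal(10)    # input signal
k1 = rng.standard_normal(3)    # first-layer filter
k2 = rng.standard_normal(2)    # second-layer filter
r, c = 3, 1.7                  # activation degree and an arbitrary nonzero scale

out = conv_net(x, k1, k2, r)
out_rescaled = conv_net(x, c * k1, k2 / c**r, r)

print(np.allclose(out, out_rescaled))
```

Distinct parameters (k1, k2) and (c k1, k2 / c^r) therefore give the same point of the neuromanifold, which is why the parameterization can be injective only after this rescaling is quotiented out.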


Geometry of Polynomial Neural Networks

arXiv.org Artificial Intelligence

We study the expressivity and learning process for polynomial neural networks (PNNs) with monomial activation functions. The weights of the network parametrize the neuromanifold. In this paper, we study certain neuromanifolds using tools from algebraic geometry: we give explicit descriptions as semialgebraic sets and characterize their Zariski closures, called neurovarieties. We study their dimension and associate an algebraic degree, the learning degree, to the neurovariety. The dimension serves as a geometric measure of the expressivity of the network, while the learning degree measures the complexity of training the network and provides upper bounds on the number of learnable functions. These theoretical results are accompanied by experiments.
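One common way to probe the dimension numerically, sketched here under assumptions and not taken from the paper's experiments, is to compute the rank of the Jacobian of the parameterization at a random point, since the dimension of the image of a polynomial map equals the generic rank of its Jacobian. The sketch assumes a shallow network f(x) = sum_i a_i (w_i . x)^r and represents a function by its values at random sample inputs; `jacobian` and `estimated_dimension` are hypothetical names, and the rank threshold is an arbitrary choice.

```python
import numpy as np

def jacobian(W, a, X, r):
    """Jacobian of (W, a) -> [f(x) for x in X], with f(x) = sum_i a_i (w_i . x)^r."""
    Z = X @ W.T                                   # (m, width): pre-activations
    d_a = Z ** r                                  # df/da_i = (w_i . x)^r
    # df/dW[i, j] = a_i * r * (w_i . x)^(r-1) * x_j
    d_W = (a * r * Z ** (r - 1))[:, :, None] * X[:, None, :]
    return np.hstack([d_W.reshape(len(X), -1), d_a])

def estimated_dimension(n, width, r, num_points=30, seed=0):
    """Numerical rank of the Jacobian at a random parameter point."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((width, n))
    a = rng.standard_normal(width)
    X = rng.standard_normal((num_points, n))      # sample inputs
    s = np.linalg.svd(jacobian(W, a, X, r), compute_uv=False)
    return int(np.sum(s > 1e-9 * s[0]))           # relative rank threshold

dim_width2 = estimated_dimension(n=3, width=2, r=2)
dim_width3 = estimated_dimension(n=3, width=3, r=2)
print(dim_width2, dim_width3)
```

For three inputs and quadratic activation, the ambient space of quadratic forms has dimension 6. Width 2 gives the symmetric matrices of rank at most 2, a hypersurface of dimension 5, while width 3 fills the ambient space, so the two ranks should come out as 5 and 6 even though the networks have 8 and 12 parameters.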


Algebraic Complexity and Neurovariety of Linear Convolutional Networks

arXiv.org Artificial Intelligence

In this paper, we study linear convolutional networks with one-dimensional filters and arbitrary strides. The neuromanifold of such a network is a semialgebraic set, represented by a space of polynomials admitting specific factorizations. Introducing a recursive algorithm, we generate polynomial equations whose common zero locus corresponds to the Zariski closure of the corresponding neuromanifold. Furthermore, we explore the algebraic complexity of training these networks employing tools from metric algebraic geometry. Our findings reveal that the number of all complex critical points in the optimization of such a network is equal to the generic Euclidean distance degree of a Segre variety. Notably, this count significantly surpasses the number of critical points encountered in the training of a fully connected linear network with the same number of parameters.
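The connection between these networks and polynomial factorizations is easiest to see in the stride-one case, which is the only case this sketch covers (the abstract treats arbitrary strides). Composing two linear convolutional layers is again a convolutional layer, and its filter is the product of the two filters viewed as polynomial coefficient vectors. The variable names below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(12)    # input signal
k1 = rng.standard_normal(3)    # first-layer filter
k2 = rng.standard_normal(4)    # second-layer filter

# Two linear layers applied in sequence (stride one, no activation).
two_layers = np.correlate(np.correlate(x, k1, mode="valid"), k2, mode="valid")

# End-to-end filter: product of the filters as polynomials.
end_to_end = np.polymul(k1, k2)
one_layer = np.correlate(x, end_to_end, mode="valid")

print(np.allclose(two_layers, one_layer))
print(np.allclose(end_to_end, np.convolve(k1, k2)))
```

The end-to-end filters reachable by such a network are therefore the polynomials of degree 5 that factor into a degree-2 and a degree-3 factor, an instance of a "space of polynomials admitting specific factorizations".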


Geometrical Singularities in the Neuromanifold of Multilayer Perceptrons

Neural Information Processing Systems

Singularities are ubiquitous in the parameter space of hierarchical models such as multilayer perceptrons. At singularities, the Fisher information matrix degenerates and the Cramér-Rao paradigm no longer holds, implying that classical model selection theory, such as AIC and MDL, cannot be applied. It is important to study the relation between the generalization error and the training error at singularities. The present paper demonstrates a method of analyzing these errors, both for the maximum likelihood estimator and for the Bayesian predictive distribution, in terms of Gaussian random fields, by using simple models.
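The degeneration of the Fisher information matrix can be seen in a one-neuron toy model. The model below is an assumption of this sketch and is not claimed to be one of the simple models analyzed in the paper: y = a * tanh(w * x) + noise, with unit-variance Gaussian noise, for which the Fisher information is the expected outer product of the gradient of the regression function. The expectation is replaced by an average over sampled inputs, and `fisher_information` is a hypothetical helper.

```python
import numpy as np

def fisher_information(a, w, xs):
    """Empirical Fisher matrix of y = a * tanh(w * x) + N(0, 1) noise."""
    t = np.tanh(w * xs)
    # Columns: df/da = tanh(w x) and df/dw = a * x * (1 - tanh(w x)^2).
    grad = np.stack([t, a * xs * (1.0 - t**2)], axis=1)
    return grad.T @ grad / len(xs)

rng = np.random.default_rng(0)
xs = rng.standard_normal(1000)

F_regular = fisher_information(a=1.0, w=0.8, xs=xs)
F_singular = fisher_information(a=0.0, w=0.8, xs=xs)   # output weight is zero

min_eig_regular = np.linalg.eigvalsh(F_regular).min()
min_eig_singular = np.linalg.eigvalsh(F_singular).min()
print(min_eig_regular, min_eig_singular)
```

When a = 0 the function does not depend on w, so the corresponding row and column of the matrix vanish and the matrix cannot be inverted. The same happens at w = 0, where the function does not depend on a.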

