AITopics

2505.11846

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
Asia > Middle East > Israel (0.04)
Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Marchetti, Giovanni Luca, Shahverdi, Vahid, Mereta, Stefano, Trager, Matthew, Kohn, Kathlén

An Invitation to Neuroalgebraic Geometry

arXiv.org Artificial Intelligence, Jan-31-2025

In this expository work, we promote the study of function spaces parameterized by machine learning models through the lens of algebraic geometry. To this end, we focus on algebraic models, such as neural networks with polynomial activations, whose associated function spaces are semi-algebraic varieties. We outline a dictionary between algebro-geometric invariants of these varieties, such as dimension, degree, and singularities, and fundamental aspects of machine learning, such as sample complexity, expressivity, training dynamics, and implicit bias. Along the way, we review the literature and discuss ideas beyond the algebraic domain. This work lays the foundations of a research direction bridging algebraic geometry and deep learning, which we refer to as neuroalgebraic geometry.

artificial intelligence, deep learning, machine learning, (14 more...)

2501.18915

Country:

North America > United States (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Shahverdi, Vahid, Marchetti, Giovanni Luca, Kohn, Kathlén

On the Geometry and Optimization of Polynomial Convolutional Networks

arXiv.org Artificial Intelligence, Oct-1-2024

We study convolutional neural networks with monomial activation functions. Specifically, we prove that their parameterization map is regular and is an isomorphism almost everywhere, up to rescaling the filters. By leveraging tools from algebraic geometry, we explore the geometric properties of the image of this map in function space, typically referred to as the neuromanifold. In particular, we compute the dimension and the degree of the neuromanifold, which measure the expressivity of the model, and describe its singularities. Moreover, for a generic large dataset, we derive an explicit formula that quantifies the number of critical points arising in the optimization of a regression loss.

distance degree, euclidean distance degree, neuromanifold, (14 more...)

2410.00722

Country:

Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Kubjas, Kaie, Li, Jiayi, Wiesmann, Maximilian

Geometry of Polynomial Neural Networks

arXiv.org Artificial Intelligence, Feb-1-2024

We study the expressivity and learning process for polynomial neural networks (PNNs) with monomial activation functions. The weights of the network parametrize the neuromanifold. In this paper, we study certain neuromanifolds using tools from algebraic geometry: we give explicit descriptions as semialgebraic sets and characterize their Zariski closures, called neurovarieties. We study their dimension and associate an algebraic degree, the learning degree, to the neurovariety. The dimension serves as a geometric measure for the expressivity of the network; the learning degree measures the complexity of training the network and provides upper bounds on the number of learnable functions. These theoretical results are accompanied by experiments.

dimension, neural network, polynomial neural network, (15 more...)

2402.00949

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Germany > Saxony > Leipzig (0.04)

Genre: Research Report (0.81)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Artificial Intelligence, Jan-29-2024

Algebraic Complexity and Neurovariety of Linear Convolutional Networks

Shahverdi, Vahid

In this paper, we study linear convolutional networks with one-dimensional filters and arbitrary strides. The neuromanifold of such a network is a semialgebraic set, represented by a space of polynomials admitting specific factorizations. Introducing a recursive algorithm, we generate polynomial equations whose common zero locus corresponds to the Zariski closure of the corresponding neuromanifold. Furthermore, we explore the algebraic complexity of training these networks employing tools from metric algebraic geometry. Our findings reveal that the number of all complex critical points in the optimization of such a network is equal to the generic Euclidean distance degree of a Segre variety. Notably, this count significantly surpasses the number of critical points encountered in the training of a fully connected linear network with the same number of parameters.

architecture, critical point, polynomial, (15 more...)

2401.16613

Country:

North America > United States > Illinois (0.04)
Europe > Switzerland > Basel-City > Basel (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
Asia > Middle East > Israel (0.04)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Neural Information Processing Systems, Apr-6-2023, 16:43:04 GMT

Geometrical Singularities in the Neuromanifold of Multilayer Perceptrons

Singularities are ubiquitous in the parameter space of hierarchical models such as multilayer perceptrons. At singularities, the Fisher information matrix degenerates, and the Cramér-Rao paradigm no longer holds, implying that classical model selection theory such as AIC and MDL cannot be applied. It is important to study the relation between the generalization error and the training error at singularities. The present paper demonstrates a method of analyzing these errors both for the maximum likelihood estimator and the Bayesian predictive distribution in terms of Gaussian random fields, by using simple models.

geometrical singularity, multilayer perceptron, neuromanifold

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)

Amari, Shun-ichi, Park, Hyeyoung, Ozeki, Tomoko

Geometrical Singularities in the Neuromanifold of Multilayer Perceptrons

Neural Information Processing Systems, Dec-31-2002

Singularities are ubiquitous in the parameter space of hierarchical models such as multilayer perceptrons. At singularities, the Fisher information matrix degenerates, and the Cramér-Rao paradigm no longer holds, implying that classical model selection theory such as AIC and MDL cannot be applied. It is important to study the relation between the generalization error and the training error at singularities. The present paper demonstrates a method of analyzing these errors both for the maximum likelihood estimator and the Bayesian predictive distribution in terms of Gaussian random fields, by using simple models. 1 Introduction A neural network is specified by a number of parameters which are synaptic weights and biases. Learning takes place by modifying these parameters from observed input-output examples.

estimator, predictive distribution, singularity, (13 more...)

Country: Asia > Japan > Honshū > Kantō > Saitama Prefecture > Saitama (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.36)

Amari, Shun-ichi, Park, Hyeyoung, Ozeki, Tomoko

Geometrical Singularities in the Neuromanifold of Multilayer Perceptrons

Neural Information Processing Systems, Dec-31-2002

Singularities are ubiquitous in the parameter space of hierarchical models such as multilayer perceptrons. At singularities, the Fisher information matrix degenerates, and the Cramér-Rao paradigm no longer holds, implying that classical model selection theory such as AIC and MDL cannot be applied. It is important to study the relation between the generalization error and the training error at singularities. The present paper demonstrates a method of analyzing these errors both for the maximum likelihood estimator and the Bayesian predictive distribution in terms of Gaussian random fields, by using simple models. 1 Introduction A neural network is specified by a number of parameters which are synaptic weights and biases. Learning takes place by modifying these parameters from observed input-output examples.

artificial intelligence, machine learning, predictive distribution, (15 more...)

Country: Asia > Japan (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.36)