AITopics | linear convolutional network

Collaborating Authors

linear convolutional network

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

An Analytical Theory of Spectral Bias in the Learning Dynamics of Diffusion Models

Neural Information Processing SystemsJun-23-2026, 09:35:54 GMT

We develop an analytical framework for understanding how the generated distribution evolves during diffusion model training. Leveraging a Gaussian-equivalence principle, we solve the full-batch gradient-flow dynamics of linear and convolutional denoisers and integrate the resulting probability-flow ODE, yielding analytic expressions for the generated distribution. The theory exposes a universal inverse-variance spectral law: the time for an eigen-or Fourier mode to match its target variance scales as τ λ 1, so high-variance (coarse) structure is mastered orders of magnitude sooner than low-variance (fine) detail. Extending the analysis to deep linear networks and circulant full-width convolutions shows that weight sharing merely multiplies learning rates--accelerating but not eliminating the bias--whereas local convolution introduces a qualitatively different bias. Experiments on Gaussian and natural-image datasets confirm the spectral law persists in deep MLP-based UNet.

artificial intelligence, machine learning, variance, (17 more...)

Neural Information Processing Systems

Country:

North America > United States (0.45)
Europe (0.27)

Genre:

Research Report > Experimental Study (0.67)
Research Report > New Finding (0.45)

Industry: Health & Medicine (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Add feedback

Implicit Bias of Gradient Descent on Linear Convolutional Networks

Neural Information Processing SystemsNov-20-2025, 21:47:05 GMT

We show that gradient descent on full-width linear convolutional networks of depth $L$ converges to a linear predictor related to the $\ell_{2/L}$ bridge penalty in the frequency domain. This is in contrast to linearly fully connected networks, where gradient descent converges to the hard margin linear SVM solution, regardless of depth.

gradient descent, implicit bias, linear convolutional network, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.97)

Add feedback

Implicit Bias of Gradient Descent on Linear Convolutional Networks

Suriya Gunasekar, Jason D. Lee, Daniel Soudry, Nati Srebro

Neural Information Processing SystemsNov-20-2025, 14:23:06 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, gradient descent, machine learning, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > California > Los Angeles County > Los Angeles (0.04)
North America > Canada > Quebec > Montreal (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.47)

Add feedback

The Riemannian Geometry associated to Gradient Flows of Linear Convolutional Networks

Achour, El Mehdi, Kohn, Kathlén, Rauhut, Holger

arXiv.org Artificial IntelligenceJul-10-2025

We study geometric properties of the gradient flow for learning deep linear convolutional networks. For linear fully connected networks, it has been shown recently that the corresponding gradient flow on parameter space can be written as a Riemannian gradient flow on function space (i.e., on the product of weight matrices) if the initialization satisfies a so-called balancedness condition. We establish that the gradient flow on parameter space for learning linear convolutional networks can be written as a Riemannian gradient flow on function space regardless of the initialization. This result holds for $D$-dimensional convolutions with $D \geq 2$, and for $D =1$ it holds if all so-called strides of the convolutions are greater than one. The corresponding Riemannian metric depends on the initialization.

artificial intelligence, machine learning, neural tangent kernel, (16 more...)

arXiv.org Artificial Intelligence

2507.06367

Country: Europe > Germany (0.46)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Implicit Bias of Gradient Descent on Linear Convolutional Networks

Neural Information Processing SystemsOct-8-2024, 14:35:45 GMT

We show that gradient descent on full-width linear convolutional networks of depth L converges to a linear predictor related to the \ell_{2/L} bridge penalty in the frequency domain. This is in contrast to linearly fully connected networks, where gradient descent converges to the hard margin linear SVM solution, regardless of depth.

gradient descent, implicit bias, linear convolutional network, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Add feedback

Reviews: Implicit Bias of Gradient Descent on Linear Convolutional Networks

Neural Information Processing SystemsOct-7-2024, 05:01:46 GMT

The paper considers the problem of formalizing the implicit bias of gradient descent on fully connected linear/convolutional networks with an exponential loss. Building on the recent work by Soudry et al. which considered a one layer neural network with no activation the paper generalizes the analysis to networks with greater depth (with no activations) and the exponential loss. The two main networks considered by the authors and the corresponding results are as follows. Linear Fully Connected Networks - In this setting the authors show that gradient descent in the limit converges to a predictor which in direction is the max margin predictor. This behaviour is the same as what was established in the earlier paper of Soudry et al for one layer neural networks.

convolutional network, gradient descent, implicit bias, (8 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.59)

Add feedback

Algebraic Complexity and Neurovariety of Linear Convolutional Networks

Shahverdi, Vahid

arXiv.org Artificial IntelligenceJan-29-2024

In this paper, we study linear convolutional networks with one-dimensional filters and arbitrary strides. The neuromanifold of such a network is a semialgebraic set, represented by a space of polynomials admitting specific factorizations. Introducing a recursive algorithm, we generate polynomial equations whose common zero locus corresponds to the Zariski closure of the corresponding neuromanifold. Furthermore, we explore the algebraic complexity of training these networks employing tools from metric algebraic geometry. Our findings reveal that the number of all complex critical points in the optimization of such a network is equal to the generic Euclidean distance degree of a Segre variety. Notably, this count significantly surpasses the number of critical points encountered in the training of a fully connected linear network with the same number of parameters.

architecture, critical point, polynomial, (15 more...)

arXiv.org Artificial Intelligence

2401.16613

Country:

North America > United States > Illinois (0.04)
Europe > Switzerland > Basel-City > Basel (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
Asia > Middle East > Israel (0.04)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Add feedback

Inductive Bias of Multi-Channel Linear Convolutional Networks with Bounded Weight Norm

Jagadeesan, Meena, Razenshteyn, Ilya, Gunasekar, Suriya

arXiv.org Machine LearningFeb-24-2021

We study the function space characterization of the inductive bias resulting from controlling the $\ell_2$ norm of the weights in linear convolutional networks. We view this in terms of an induced regularizer in the function space given by the minimum norm of weights required to realize a linear function. For two layer linear convolutional networks with $C$ output channels and kernel size $K$, we show the following: (a) If the inputs to the network have a single channel, the induced regularizer for any $K$ is a norm given by a semidefinite program (SDP) that is independent of the number of output channels $C$. We further validate these results through a binary classification task on MNIST. (b) In contrast, for networks with multi-channel inputs, multiple output channels can be necessary to merely realize all matrix-valued linear functions and thus the inductive bias does depend on $C$. Further, for sufficiently large $C$, the induced regularizer for $K=1$ and $K=D$ are the nuclear norm and the $\ell_{2,1}$ group-sparse norm, respectively, of the Fourier coefficients -- both of which promote sparse structures.

gradient descent, predictor, regularizer, (15 more...)

arXiv.org Machine Learning

2102.12238

Country: North America > United States > California > Alameda County > Berkeley (0.04)

Genre: Research Report > New Finding (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

Add feedback

Implicit Bias of Gradient Descent on Linear Convolutional Networks

Gunasekar, Suriya, Lee, Jason D., Soudry, Daniel, Srebro, Nati

Neural Information Processing SystemsFeb-14-2020, 20:43:04 GMT

gradient descent, implicit bias, linear convolutional network, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Add feedback

Implicit Bias of Gradient Descent on Linear Convolutional Networks

Gunasekar, Suriya, Lee, Jason D., Soudry, Daniel, Srebro, Nati

Neural Information Processing SystemsDec-31-2018

Large scale neural networks used in practice are highly over-parameterized with far more trainable model parameters compared to the number of training examples. Consequently, optimization objectives for learning such high capacity models have many global minima that fit training data perfectly. However, minimizing the training loss using specific optimization algorithms take us to not just any global minima, but some special global minima, e.g., global minima minimizing some regularizer R(β). In over-parameterized models, specially deep neural networks, much, if not most, of the inductive bias of the learned model comes from this implicit regularization from the optimization algorithm. Understanding the implicit bias, e.g., via characterizing R(β), is thus essential for understanding how and what the model learns.

artificial intelligence, gradient descent, machine learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > California > Los Angeles County > Los Angeles (0.04)
North America > Canada > Quebec > Montreal (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

Add feedback