Search query: two-layer linear network
The Impact of Anisotropic Covariance Structure on the Training Dynamics and Generalization Error of Linear Networks
Taishi Watanabe, Ryo Karakida, Jun-nosuke Teramae
The success of deep neural networks largely depends on the statistical structure of the training data. While learning dynamics and generalization on isotropic data are well-established, the impact of pronounced anisotropy on these crucial aspects is not yet fully understood. We examine the impact of data anisotropy, represented by a spiked covariance structure, a canonical yet tractable model, on the learning dynamics and generalization error of a two-layer linear network in a linear regression setting. Our analysis reveals that the learning dynamics proceed in two distinct phases, governed initially by the input-output correlation and subsequently by other principal directions of the data structure. Furthermore, we derive an analytical expression for the generalization error, quantifying how the alignment of the spike structure of the data with the learning task improves performance. Our findings offer deep theoretical insights into how data anisotropy shapes the learning trajectory and final performance, providing a foundation for understanding complex interactions in more advanced network architectures.
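As a rough illustration of the setting this abstract describes, here is a minimal NumPy sketch, not the paper's exact construction: the dimensions, spike strength alpha, initialization scale, and learning rate are all arbitrary choices. It trains a two-layer linear network f(x) = W2 W1 x by gradient descent on regression data whose inputs have spiked covariance I + alpha * u u^T, with the teacher aligned to the spike direction u.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup (not the paper's exact parameters): inputs with
# spiked covariance Sigma = I + alpha * u u^T, a linear teacher aligned
# with the spike, and a two-layer linear student trained by gradient descent.
d, h, n, alpha = 20, 10, 500, 9.0
u = rng.normal(size=d)
u /= np.linalg.norm(u)                               # spike direction
X = rng.normal(size=(n, d)) + np.sqrt(alpha) * rng.normal(size=(n, 1)) * u
y = X @ u                                            # task aligned with the spike

W1 = 1e-3 * rng.normal(size=(h, d))                  # small initialization
W2 = 1e-3 * rng.normal(size=(1, h))
lr = 5e-3
for t in range(2000):
    pred = X @ W1.T @ W2.T                           # (n, 1)
    err = pred - y[:, None]
    loss = 0.5 * np.mean(err ** 2)
    gW2 = (err.T @ (X @ W1.T)) / n                   # dL/dW2
    gW1 = (W2.T @ err.T @ X) / n                     # dL/dW1
    W1 -= lr * gW1
    W2 -= lr * gW2
    if t % 400 == 0:
        print(f"step {t:4d}  loss {loss:.4e}")
```

With a small initialization scale, the printed loss typically shows a plateau followed by a rapid drop along the task-aligned direction, the kind of phase structure the abstract attributes to the input-output correlation dominating early learning.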
On the spectral bias of two-layer linear networks
This paper studies the behaviour of two-layer fully connected networks with linear activations trained with gradient flow on the square loss. We show how the optimization process carries an implicit bias on the parameters that depends on the scale of its initialization. The main result of the paper is a variational characterization of the loss minimizers retrieved by the gradient flow for a specific initialization shape. This characterization reveals that, in the small scale initialization regime, the linear neural network's hidden layer is biased toward having a low-rank structure. To complement our results, we showcase a hidden mirror flow that tracks the dynamics of the singular values of the weights matrices and describe their time evolution. We support our findings with numerical experiments illustrating the phenomena.
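A minimal sketch of the low-rank bias described above, under stated assumptions: discrete gradient descent with a small step size stands in for the paper's gradient flow, and the matrix sizes, target, and initialization scale are illustrative choices rather than the paper's setup. It tracks the singular values of the hidden-layer weights W1 during training on a rank-1 regression task.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical illustration: train W2 W1 on a rank-1 target map with
# small-scale initialization and watch the singular values of W1.
d, h, n = 15, 15, 400
X = rng.normal(size=(n, d))
w_star = rng.normal(size=(d, 1))
y = X @ w_star                                       # rank-1 target map

scale = 1e-4                                         # small initialization regime
W1 = scale * rng.normal(size=(h, d))
W2 = scale * rng.normal(size=(1, h))
lr = 1e-2
for t in range(5001):
    err = X @ W1.T @ W2.T - y                        # (n, 1)
    gW2 = (err.T @ (X @ W1.T)) / n
    gW1 = (W2.T @ err.T @ X) / n
    W1 -= lr * gW1
    W2 -= lr * gW2
    if t % 1000 == 0:
        sv = np.linalg.svd(W1, compute_uv=False)
        print(f"step {t:5d}  top-3 singular values of W1: {sv[:3].round(4)}")

# With small init, one singular value grows while the rest stay near zero,
# i.e. the hidden layer remains approximately rank-1.
```

Printing the spectrum at intervals shows a single singular value escaping the near-zero plateau while the others stay at the initialization scale, consistent with the low-rank bias in the small-scale initialization regime.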