eigenmode


Implicit variance regularization in non-contrastive SSL

Neural Information Processing Systems

In this work, we provide a comparative analysis of the learning dynamics for the Euclidean and cosine-based asymmetric losses in the eigenspace of the closed-form predictor DirectPred.
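For context, DirectPred replaces the trained predictor network with a closed-form one built from the eigendecomposition of the embedding correlation matrix. The sketch below illustrates that construction in NumPy; the exponent `alpha` and the small spectral floor `eps` are illustrative assumptions, not the exact settings analyzed in the paper.

```python
import numpy as np

def directpred_predictor(z, alpha=0.5, eps=1e-4):
    """Closed-form predictor in the style of DirectPred: set
    W_p = U diag(p(lam)) U^T from the eigendecomposition of the
    embedding correlation matrix, instead of training a predictor."""
    F = z.T @ z / z.shape[0]              # correlation matrix of the batch
    lam, U = np.linalg.eigh(F)            # eigenvalues ascending, U orthonormal
    lam = np.clip(lam, 0.0, None)
    p = (lam / lam.max()) ** alpha + eps  # spectral map; the eps floor is an assumption
    return (U * p) @ U.T                  # W_p = U diag(p) U^T

z = np.random.randn(256, 64)              # a batch of online-network embeddings
W_p = directpred_predictor(z)             # symmetric predictor, shape (64, 64)
```

Because W_p shares eigenvectors with the correlation matrix, the loss decouples across eigenmodes, which is what makes the per-eigenmode analysis of the two losses tractable.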




An Analytical Theory of Power Law Spectral Bias in the Learning Dynamics of Diffusion Models

Wang, Binxu

arXiv.org Machine Learning

We developed an analytical framework for understanding how the learned distribution evolves during diffusion model training. Leveraging the Gaussian equivalence principle, we derived exact solutions for the gradient-flow dynamics of weights in one- or two-layer linear denoiser settings with arbitrary data. Remarkably, these solutions allow us to derive the generated distribution in closed form, along with its KL divergence to the data distribution, throughout training. These analytical results expose a pronounced power-law spectral bias: for both weights and distributions, the convergence time of a mode follows an inverse power law of its variance. Empirical experiments on both Gaussian and image datasets demonstrate that the power-law spectral bias remains robust even when using deeper or convolutional architectures. Our results underscore the importance of the data covariance in dictating the order and rate at which diffusion models learn different modes of the data, providing potential explanations for why stopping training too early can leave incorrect details in image generative models.
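To make the headline scaling concrete, here is a toy gradient-flow illustration (our own, far simpler than the paper's one- and two-layer denoiser analysis): for a linear model whose residual on mode k decays as exp(-lam_k * t), the time to reach a fixed error threshold scales as 1/lam_k, i.e., inversely with the mode's variance.

```python
import numpy as np

# Toy illustration of power-law spectral bias: under gradient flow on a
# quadratic loss over Gaussian data with per-mode variances lam_k, the
# residual of mode k decays as exp(-lam_k * t), so its convergence time
# scales as 1/lam_k (an inverse power law in the variance).
lam = np.array([1.0, 0.1, 0.01])          # per-mode data variances
t = np.linspace(0.0, 1000.0, 100001)
for k, lk in enumerate(lam):
    err = np.exp(-lk * t)                 # mode-k residual under gradient flow
    t_conv = t[np.argmax(err < 1e-2)]     # first time the residual drops below 1%
    print(f"mode {k}: variance {lk:.2f} -> convergence time {t_conv:.1f}")
```

The printed convergence times come out in the ratio 1 : 10 : 100, mirroring the inverse dependence on mode variance.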


ModeConv: A Novel Convolution for Distinguishing Anomalous and Normal Structural Behavior

Schaller, Melanie, Schlör, Daniel, Hotho, Andreas

arXiv.org Artificial Intelligence

External influences such as traffic and environmental factors induce vibrations in structures, leading to material degradation over time. These vibrations result in cracks due to the material's lack of plasticity, compromising structural integrity. Detecting such damage requires the installation of vibration sensors to capture the internal dynamics. However, distinguishing relevant eigenmodes from external noise motivates the use of deep learning models. Eigenmodes, representing characteristic vibration patterns, provide insights into structural dynamics and deviations from expected states; detecting changes in eigenmodes can thus be used to anticipate shifts in material properties and to discern between normal and anomalous structural behavior. We therefore propose ModeConv to automatically capture and analyze changes in eigenmodes, facilitating effective anomaly detection in structures and material properties. In the conducted experiments, ModeConv demonstrates computational efficiency improvements, resulting in reduced runtime for model calculations. The novel ModeConv neural network layer is tailored for temporal graph neural networks, in which every node represents one sensor. ModeConv employs an SVD-based convolutional filter design for complex numbers and leverages modal transformation in lieu of Fourier or Laplace transformations in spectral graph convolutions. We include a mathematical complexity analysis illustrating the runtime reduction.
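The following is a deliberately simplified, hypothetical sketch of the modal-filtering idea the abstract describes: project multi-sensor signals onto dominant SVD modes, reweight them, and project back. It omits the graph structure, complex-valued filters, and learned parameters of the actual ModeConv layer.

```python
import numpy as np

def modal_filter(X, n_modes=4, seed=0):
    """Toy modal-domain filter (not the authors' layer): use the left
    singular vectors of the sensor matrix as a modal transform in lieu
    of a Fourier basis, scale the modal coefficients, transform back."""
    rng = np.random.default_rng(seed)
    U, _, _ = np.linalg.svd(X, full_matrices=False)  # modes of (sensors, time) data
    gains = rng.normal(size=n_modes)        # stand-in for learned filter weights
    coeffs = U[:, :n_modes].T @ X           # modal transform
    return U[:, :n_modes] @ (gains[:, None] * coeffs)  # filter + inverse transform

X = np.random.randn(8, 1024)                # 8 sensors, 1024 time steps
Y = modal_filter(X)                          # filtered signals, same shape as X
```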


High dimensional analysis reveals conservative sharpening and a stochastic edge of stability

Agarwala, Atish, Pennington, Jeffrey

arXiv.org Artificial Intelligence

Recent empirical and theoretical work has shown that the dynamics of the large eigenvalues of the training loss Hessian have some remarkably robust features across models and datasets in the full batch regime. There is often an early period of progressive sharpening where the large eigenvalues increase, followed by stabilization at a predictable value known as the edge of stability. Previous work showed that in the stochastic setting, the eigenvalues increase more slowly, a phenomenon we call conservative sharpening. We provide a theoretical analysis of a simple high-dimensional model which shows the origin of this slowdown. We also show that there is an alternative stochastic edge of stability which arises at small batch size that is sensitive to the trace of the Neural Tangent Kernel rather than the large Hessian eigenvalues. We conduct an experimental study that highlights the qualitative differences from the full batch phenomenology, and suggests that controlling the stochastic edge of stability can help optimization.
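For context, the deterministic edge of stability the abstract builds on can be seen in a three-line quadratic model: gradient descent with step size eta is stable on a mode only while that mode's Hessian eigenvalue stays below 2/eta. The snippet below is our toy illustration; the paper's stochastic threshold, governed by the NTK trace at small batch size, is not captured here.

```python
import numpy as np

# Full-batch edge of stability in a toy quadratic: x_{t+1} = (I - eta*H) x_t
# converges per mode iff eta * lam < 2, i.e. iff lam < 2/eta.
eta = 0.01
H = np.diag([250.0, 30.0, 1.0])           # toy Hessian; threshold 2/eta = 200
x = np.ones(3)
for _ in range(100):
    x = x - eta * (H @ x)                 # plain gradient descent
for lam_k, r in zip(np.diag(H), x):
    status = "diverges" if eta * lam_k > 2 else "converges"
    print(f"eigenvalue {lam_k:5.0f}: residual {r:.3e} ({status})")
```

Only the mode with eigenvalue 250 > 2/eta blows up, while the others shrink, which is the stability boundary the sharpening dynamics hover around.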


Why Shallow Networks Struggle with Approximating and Learning High Frequency: A Numerical Study

Zhang, Shijun, Zhao, Hongkai, Zhong, Yimin, Zhou, Haomin

arXiv.org Machine Learning

In this work, a comprehensive numerical study involving analysis and experiments shows why a two-layer neural network has difficulty handling high frequencies in approximation and learning when machine precision and computation cost are important factors in practice. In particular, the following basic computational issues are investigated: (1) the minimal numerical error one can achieve given a finite machine precision, (2) the computation cost to achieve a given accuracy, and (3) stability with respect to perturbations. The key to the study is the conditioning of the representation and its learning dynamics. Explicit answers to the above questions with numerical verifications are presented.
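As a small illustration of the conditioning issue (our own toy, not the paper's experiments), one can fit increasingly high-frequency targets with a fixed random two-layer ReLU representation and watch the fit degrade while the feature Gram matrix is badly conditioned.

```python
import numpy as np

# Fit sin(k*pi*x) by least squares over a fixed random-ReLU feature basis
# (a two-layer network with frozen first layer). Accuracy collapses as the
# target frequency k grows, reflecting the ill-conditioned representation.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 400)[:, None]
W, b = rng.normal(size=(1, 200)), rng.normal(size=200)
Phi = np.maximum(x @ W + b, 0.0)                    # hidden-layer features
print(f"Gram condition number: {np.linalg.cond(Phi.T @ Phi):.2e}")
for k in (1, 4, 16, 64):
    y = np.sin(k * np.pi * x[:, 0])
    c, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    rel_err = np.linalg.norm(Phi @ c - y) / np.linalg.norm(y)
    print(f"frequency {k:2d}: relative fit error {rel_err:.3f}")
```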