

The monotonicity of the Franz-Parisi potential is equivalent with Low-degree MMSE lower bounds

Tsirkas, Konstantinos, Wang, Leda, Zadik, Ilias

arXiv.org Machine Learning

Over the last decades, two distinct approaches have been instrumental to our understanding of the computational complexity of statistical estimation. The statistical physics literature predicts algorithmic hardness through local stability and monotonicity properties of the Franz--Parisi (FP) potential \cite{franz1995recipes,franz1997phase}, while the mathematically rigorous literature characterizes hardness via the limitations of restricted algorithmic classes, most notably low-degree polynomial estimators \cite{hopkins2017efficient}. For many inference models, these two perspectives yield strikingly consistent predictions, giving rise to a long-standing open problem of establishing a precise mathematical relationship between them. In this work, we show that, for a broad family of Gaussian additive models (GAMs) with signal-to-noise ratio $\lambda$, the power of low-degree polynomial estimators is equivalent to the monotonicity of the annealed FP potential. In particular, subject to a low-degree conjecture for GAMs, our results imply that the polynomial-time limits of these models are directly determined by the monotonicity of the annealed FP potential, in conceptual agreement with predictions from the physics literature dating back to the 1990s.
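For orientation, here is a minimal sketch of the standard objects named above, written in the conventional notation of the low-degree literature (the paper's precise definitions may differ). In a Gaussian additive model one observes a hidden signal $X \sim \mu$ through

$$Y = \lambda X + Z, \qquad Z \sim \mathcal{N}(0, I) \ \text{independent of } X,$$

and the degree-$D$ minimum mean squared error is

$$\mathrm{MMSE}_{\le D}(\lambda) \;=\; \inf_{\deg f \le D} \, \mathbb{E}\,\big\|X - f(Y)\big\|^{2},$$

where the infimum ranges over (coordinate-wise) polynomials of degree at most $D$ in the entries of $Y$. The annealed FP potential is, roughly, the annealed free energy of configurations constrained to a prescribed overlap with the planted signal $X$, and the physics criterion asks whether this potential is monotone in the overlap.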









Neural Information Processing Systems

However, the structure of their hidden layer representations is only theoretically well-understood in certain infinite-width limits, in which these representations cannot flexibly adapt to learn data-dependent features [3-11, 24]. In the Bayesian setting, these representations are described by fixed, deterministic kernels [3-11].
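To make the phrase "fixed, deterministic kernels" concrete, the following is a minimal, self-contained sketch of the standard NNGP kernel recursion for an infinite-width fully connected ReLU network; it illustrates the general idea only and is not code from the paper (the function name, depth, and variance parameters are illustrative assumptions).

import numpy as np

def nngp_relu_kernel(X, depth=3, sigma_w2=2.0, sigma_b2=0.0):
    # Kernel of an infinite-width, depth-`depth` fully connected ReLU network
    # (the arc-cosine / NNGP recursion). X has shape (n, d).
    K = sigma_b2 + sigma_w2 * (X @ X.T) / X.shape[1]  # input-layer covariance
    for _ in range(depth):
        diag = np.sqrt(np.clip(np.diag(K), 1e-12, None))
        cos_t = np.clip(K / np.outer(diag, diag), -1.0, 1.0)
        theta = np.arccos(cos_t)
        # closed form of E[relu(u) * relu(v)] for jointly Gaussian (u, v)
        K = sigma_b2 + sigma_w2 * np.outer(diag, diag) / (2 * np.pi) * (
            np.sin(theta) + (np.pi - theta) * np.cos(theta))
    return K

# Example usage: the Bayesian posterior over outputs is a GP with this kernel.
X = np.random.default_rng(0).normal(size=(5, 10))
print(nngp_relu_kernel(X).shape)  # (5, 5)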



A self-consistent theory of Gaussian Processes captures feature learning effects in finite CNNs

Neural Information Processing Systems

Despite its theoretical appeal, this viewpoint lacks a crucial ingredient of deep learning in finite DNNs, lying at the heart of their success: feature learning. Here we consider DNNs trained with noisy gradient descent on a large training set and derive a self-consistent Gaussian Process theory accounting for strong finite-DNN and feature learning effects.
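As a rough illustration of the training protocol the abstract refers to (not the paper's own setup; the function name, step size, and temperature below are placeholder assumptions), "noisy gradient descent" here means ordinary gradient descent with injected Gaussian noise, i.e. discretized Langevin dynamics whose long-time weight distribution approximates a Gibbs/Bayesian posterior.

import numpy as np

def noisy_gradient_descent(grad_loss, w0, lr=1e-3, temperature=1e-2,
                           steps=10_000, seed=0):
    # Langevin-type update: w <- w - lr * grad(w) + sqrt(2 * lr * T) * xi,
    # with xi standard Gaussian, so the stationary law is ~ exp(-loss(w) / T).
    rng = np.random.default_rng(seed)
    w = np.array(w0, dtype=float)
    for _ in range(steps):
        w = w - lr * grad_loss(w) + np.sqrt(2.0 * lr * temperature) * rng.normal(size=w.shape)
    return w

# Example: for the quadratic loss 0.5 * ||w - 1||^2, iterates hover near w = 1.
print(noisy_gradient_descent(lambda w: w - 1.0, w0=np.zeros(3)))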