
Are Statistical Methods Obsolete in the Era of Deep Learning?

Wu, Skyler, Yang, Shihao, Kou, S. C.

arXiv.org Machine Learning

The advancement of deep neural network models in the last fifteen years has profoundly altered the scientific landscape of estimation, prediction and decision making, from the early success of image recognition (Krizhevsky et al., 2012; He et al., 2016), to the success of self-learning of board games (Silver et al., 2017), to machine translation (Wu et al., 2016), to generative AI (Ho et al., 2020), and to the success of protein structure prediction (Jumper et al., 2021), among many other developments. In many of these successes, there are no well-established mechanistic models to describe the underlying problem (for example, we do not fully understand how human brains translate from one language to another). As such, it is conceivable that such successes are attributable to deep neural networks' remarkable capabilities for universal function approximation. In contrast, the hand-crafted models that existed before deep neural networks (such as n-gram models (Katz, 1987; Brown et al., 1992; Bengio et al., 2000)) were too restricted to offer satisfactory approximation. An interesting question, then, is how well deep neural network models work when well-established mechanistic models do exist (as in the physical sciences, where decades of theoretical and experimental endeavor have yielded highly accurate mechanistic models in many cases) -- in particular, how the inference and prediction results of deep neural network models compare to those of statistical approaches built on reliable mechanistic models.


Reviews: Exponential expressivity in deep neural networks through transient chaos

Neural Information Processing Systems

This is a very interesting work. However, I have a few major concerns: 1) I believe Theorem 1 is wrong, as can be seen from the counterexample at the bottom of this review. As can be observed from this counterexample, the main problem in the proof is the inaccurate sentence on lines 110-112 in the supplementary material. I'll wait for the authors' feedback before deciding whether this is a fatal flaw. In this case, the h_i^l are all composed of different linear sums of the same random vector x^{l-1}, and are therefore dependent.


Extracting Signal out of Chaos: Advancements on MAGI for Bayesian Analysis of Dynamical Systems

Wu, Skyler

arXiv.org Machine Learning

This work builds on the manifold-constrained Gaussian process inference (MAGI) method for Bayesian parameter inference and trajectory reconstruction of ODE-based dynamical systems, focusing primarily on sparse and noisy data conditions. First, we introduce Pilot MAGI (pMAGI), a novel methodological upgrade on the base MAGI method that confers significantly improved numerical stability, parameter inference, and trajectory reconstruction. Second, we demonstrate, for the first time to our knowledge, how one can combine MAGI-based methods with dynamical systems theory to provide probabilistic classifications of whether a system is stable or chaotic. Third, we demonstrate how pMAGI performs favorably in many settings against much more computationally expensive and overparameterized methods. Fourth, we introduce Pilot MAGI Sequential Prediction (PMSP), a novel method building upon pMAGI that allows one to predict the trajectory of ODE-based dynamical systems multiple time steps into the future, given only sparse and noisy observations. We show that PMSP can output accurate future predictions even on chaotic dynamical systems and significantly outperform PINN-based methods. Overall, we contribute to the literature two novel methods, pMAGI and PMSP, that serve as Bayesian, uncertainty-quantified competitors to the Physics-Informed Neural Network.


Machine learning prediction of critical transition and system collapse

Kong, Ling-Wei, Fan, Hua-Wei, Grebogi, Celso, Lai, Ying-Cheng

arXiv.org Artificial Intelligence

Predicting a critical transition due to parameter drift without relying on a model is an outstanding problem in nonlinear dynamics and applied fields. A closely related problem is to predict whether the system is already in, or will soon enter, a transient state preceding its collapse. We develop a model-free, machine-learning-based solution to both problems by exploiting reservoir computing with an added parameter input channel. We demonstrate that, when the machine is trained in the normal functioning regime with a chaotic attractor (i.e., before the critical transition), the transition point can be predicted accurately. Remarkably, for a parameter drift through the critical point, the machine with the parameter input channel is able to predict not only that the system will be in a transient state, but also the average transient time before the final collapse.
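The core idea of a parameter input channel can be illustrated with a minimal echo state network. The sketch below is a toy, not the authors' setup: the names, sizes, and the simple sine-wave task are all illustrative assumptions. The drifting parameter is fed as an extra input channel alongside the signal, and a linear readout is fit by ridge regression for one-step-ahead prediction.

```python
import numpy as np

rng = np.random.default_rng(42)
n_res = 200  # reservoir size (illustrative choice)

# Random reservoir matrix, rescaled so its spectral radius is below 1
W = rng.normal(0.0, 1.0, (n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))

# Input weights: channel 0 carries the signal, channel 1 carries the parameter
W_in = rng.uniform(-0.5, 0.5, (n_res, 2))

def run_reservoir(u, p):
    """Drive the reservoir with signal u at constant parameter p; return states."""
    r = np.zeros(n_res)
    states = []
    for u_t in u:
        r = np.tanh(W @ r + W_in @ np.array([u_t, p]))
        states.append(r.copy())
    return np.array(states)

# Toy training data: sine waves whose frequency is set by the parameter p
t = np.linspace(0, 20 * np.pi, 2000)
X, Y = [], []
for p in (0.5, 1.0):
    u = np.sin(p * t)
    s = run_reservoir(u[:-1], p)
    X.append(s[200:])        # drop the initial transient
    Y.append(u[1:][200:])    # one-step-ahead target
X, Y = np.vstack(X), np.concatenate(Y)

# Ridge-regression readout
W_out = np.linalg.solve(X.T @ X + 1e-6 * np.eye(n_res), X.T @ Y)

# One-step prediction at one of the training parameter values
u_eval = np.sin(0.5 * t)
s_eval = run_reservoir(u_eval[:-1], 0.5)
pred = s_eval[200:] @ W_out
err = np.sqrt(np.mean((pred - u_eval[1:][200:]) ** 2))
```

Because the parameter enters the reservoir as a regular input, the same trained machine can be queried at parameter values beyond the training range, which is what enables the extrapolation to the critical transition described in the abstract.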


Exponential expressivity in deep neural networks through transient chaos

Poole, Ben, Lahiri, Subhaneil, Raghu, Maithra, Sohl-Dickstein, Jascha, Ganguli, Surya

Neural Information Processing Systems

We combine Riemannian geometry with the mean field theory of high dimensional chaos to study the nature of signal propagation in deep neural networks with random weights. Our results reveal a phase transition in the expressivity of random deep networks, with networks in the chaotic phase computing nonlinear functions whose global curvature grows exponentially with depth, but not with width. We prove that this generic class of random functions cannot be efficiently computed by any shallow network, going beyond prior work that restricts their analysis to single functions. Moreover, we formally quantify and demonstrate the long conjectured idea that deep networks can disentangle exponentially curved manifolds in input space into flat manifolds in hidden space. Our theoretical framework for analyzing the expressive power of deep networks is broadly applicable and provides a basis for quantifying previously abstract notions about the geometry of deep functions.
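The ordered/chaotic phase transition above can be seen numerically by propagating two nearby inputs through the same random tanh network. The sketch below is a minimal illustration, not the authors' code; the width, depth, and weight-variance values are assumptions chosen to place the network on either side of the transition.

```python
import numpy as np

def pair_distance(sigma_w, sigma_b=0.3, width=500, depth=20, eps=1e-3, seed=1):
    """Propagate two nearby inputs through one random tanh network with
    layer weights ~ N(0, sigma_w^2 / width); return their final distance."""
    rng = np.random.default_rng(seed)
    h1 = rng.normal(size=width)
    h2 = h1 + eps * rng.normal(size=width)  # small input perturbation
    for _ in range(depth):
        W = rng.normal(0.0, sigma_w / np.sqrt(width), size=(width, width))
        b = rng.normal(0.0, sigma_b, size=width)
        h1 = np.tanh(W @ h1 + b)
        h2 = np.tanh(W @ h2 + b)
    return np.linalg.norm(h1 - h2)

d_ordered = pair_distance(sigma_w=0.5)  # ordered phase: perturbation contracts
d_chaotic = pair_distance(sigma_w=2.5)  # chaotic phase: perturbation is amplified
```

In the ordered phase the two trajectories converge with depth, while in the chaotic phase they decorrelate to an order-one distance, consistent with the depth-wise exponential expressivity described in the abstract.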

