Supplementary Information for: The Limitations of Large Width in Neural Networks: A Deep Gaussian Process Perspective
Neural Information Processing Systems
In this section we discuss various Deep GP facts presented throughout the main paper, and we formalize the claim that Deep GP have "infinite capacity." To show that the neural network defined in Eq. (1) is a (degenerate) Deep GP, we must show that each of its layers corresponds to a (degenerate, vector-valued) Gaussian process. Although each layer is a (degenerate) GP, its covariance function corresponds only to a finite basis, and therefore these layers do not have the same properties as nonparametric Deep GP (i.e. the ability to model any function to arbitrary precision). We also note that Deep GP and (Bayesian) neural networks can both be generalized to other hierarchical models, such as Deep Kernel Processes [3].
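The distinction between a degenerate and a nonparametric covariance can be seen numerically. The following sketch (illustrative only; the feature map, weights, and kernel choices are our assumptions, not taken from the paper) builds the finite-basis covariance induced by a hidden layer with a fixed number of features and compares its rank to that of a nonparametric RBF kernel:

```python
import numpy as np

# Illustrative sketch (not from the paper): a layer with a finite number of
# features induces a *degenerate* GP whose Gram matrix has rank at most the
# number of features, whereas a nonparametric kernel (here RBF) is not
# limited to a finite basis.

rng = np.random.default_rng(0)
n, width = 20, 5                        # 20 inputs, only 5 hidden features

x = rng.normal(size=(n, 1))             # hypothetical 1-d inputs
W = rng.normal(size=(1, width))         # hypothetical random layer weights
phi = np.tanh(x @ W)                    # finite feature map, shape (n, width)

K_degenerate = phi @ phi.T / width      # finite-basis (degenerate) covariance
K_rbf = np.exp(-0.5 * (x - x.T) ** 2)   # nonparametric RBF covariance

rank_deg = np.linalg.matrix_rank(K_degenerate)   # at most `width`
rank_rbf = np.linalg.matrix_rank(K_rbf)          # larger: not basis-limited
print(rank_deg, rank_rbf)
```

The degenerate Gram matrix can never exceed rank 5 regardless of how many inputs are supplied, which is one concrete sense in which a finite-width layer lacks the "infinite capacity" of a nonparametric Deep GP.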