Uncertainty Quantification From Scaling Laws in Deep Neural Networks

Ibrahim Elsharkawy, Yonatan Kahn, Benjamin Hooberman

arXiv.org Artificial Intelligence 

Deep learning techniques have improved performance beyond conventional methods in a wide variety of tasks. However, for neural networks in particular, it is not straightforward to assign an uncertainty to the network output as a function of network architecture, training algorithm, and initialization [1]. One approach to uncertainty quantification (UQ) is to treat any individual network as a draw from an ensemble, and to identify the systematic uncertainty with the variance of the neural network outputs over the ensemble [2, 3]. This variance can certainly be measured empirically by training a large ensemble of networks, but it would be advantageous to be able to predict it from first principles. This is possible in the infinite-width limit of multi-layer perceptron (MLP) architectures, where the statistics of the network outputs after training are Gaussian, with mean and variance determined by the neural tangent kernel (NTK) [4-6]. For realistic MLPs with large but finite width n, one can compute corrections to this Gaussian distribution that are perturbative in 1/n [7].