Uncertainty Quantification From Scaling Laws in Deep Neural Networks
Ibrahim Elsharkawy, Yonatan Kahn, Benjamin Hooberman
arXiv.org Artificial Intelligence
Deep learning techniques have improved performance beyond conventional methods in a wide variety of tasks. However, for neural networks in particular, it is not straightforward to assign a network-induced uncertainty to their output as a function of network architecture, training algorithm, and initialization [1]. One approach to uncertainty quantification (UQ) is to treat any individual network as a draw from an ensemble, and to identify the systematic uncertainty with the variance of the neural network outputs over the ensemble [2, 3]. This variance can certainly be measured empirically by training a large ensemble of networks, but it would be advantageous to be able to predict it from first principles. This is possible in the infinite-width limit of multi-layer perceptron (MLP) architectures, where the statistics of the network outputs after training are Gaussian with mean and variance determined by the neural tangent kernel (NTK) [4-6]. For realistic MLPs with large but finite width n, one can compute corrections to this Gaussian distribution that are perturbative in 1/n [7].
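A minimal sketch of the empirical ensemble approach described above: train many finite-width MLPs that differ only in their random initialization (and any stochasticity in training), then take the ensemble variance of their outputs at test points as the network-induced uncertainty. This is not the authors' code; the architecture, hyperparameters, and toy data are illustrative assumptions.

```python
# Illustrative sketch (assumed names and hyperparameters, not the paper's setup):
# estimate the ensemble variance of MLP outputs by training independently
# initialized networks on the same data.
import torch
import torch.nn as nn

def make_mlp(width=256, depth=3, d_in=1, d_out=1):
    """Finite-width MLP; each call draws fresh random initial weights."""
    layers, d = [], d_in
    for _ in range(depth - 1):
        layers += [nn.Linear(d, width), nn.ReLU()]
        d = width
    layers.append(nn.Linear(d, d_out))
    return nn.Sequential(*layers)

def train(net, x, y, steps=2000, lr=1e-3):
    """Plain gradient descent on MSE loss."""
    opt = torch.optim.SGD(net.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        nn.functional.mse_loss(net(x), y).backward()
        opt.step()
    return net

# Toy training data and test points (assumed for illustration).
torch.manual_seed(0)
x_train = torch.linspace(-1, 1, 32).unsqueeze(-1)
y_train = torch.sin(3 * x_train)
x_test = torch.linspace(-1.5, 1.5, 200).unsqueeze(-1)

# Train an ensemble; each member differs only in its random initialization.
ensemble_outputs = []
for seed in range(50):
    torch.manual_seed(seed)
    net = train(make_mlp(), x_train, y_train)
    with torch.no_grad():
        ensemble_outputs.append(net(x_test))

outputs = torch.stack(ensemble_outputs)   # shape: (n_members, n_test, 1)
mean = outputs.mean(dim=0)                # ensemble-mean prediction
variance = outputs.var(dim=0)             # network-induced (systematic) uncertainty
print(mean.shape, variance.shape)
```

In the infinite-width limit the same mean and variance would be predicted by the NTK; the paper's point is to predict the finite-width ensemble variance analytically rather than by brute-force training as above.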
Mar-7-2025