Condensed Stein Variational Gradient Descent for Uncertainty Quantification of Neural Networks
Padmanabha, Govinda Anantha; Safta, Cosmin; Bouklas, Nikolaos; Jones, Reese E.
In the context of uncertainty quantification (UQ), the curse of dimensionality, whereby quantification efficiency degrades drastically with parameter dimension, is particularly extreme for highly parameterized models such as neural networks (NNs). Fortunately, in many cases these models are overparameterized in the sense that the number of parameters can be reduced with negligible effect on accuracy, and sometimes with improvements in generalization [1]. Furthermore, NN parameterizations often contain fungible parameters, such that permutations of values and connections lead to equivalent output responses. This suggests that methods which simultaneously sparsify a model and characterize its uncertainty, while handling and taking advantage of the symmetries inherent in the model, are potentially advantageous. Although Markov chain Monte Carlo (MCMC) methods [2] have been the reference standard for generating samples for UQ, they can be temperamental and do not scale well to high-dimensional models. More recently, variational inference (VI) methods have seen widespread use; these cast the parameter posterior sampling problem as the optimization of a surrogate posterior guided by a suitable objective, such as the Kullback-Leibler (KL) divergence between the surrogate posterior and the true posterior induced by the data. In particular, there is now a family of model ensemble methods based on Stein's identity [3], such as Stein variational gradient descent (SVGD) [4], projected SVGD [5], and the Stein variational Newton method [6]. These methods have advantages over MCMC by virtue of propagating, in parallel, a coordinated ensemble of particles that represents the empirical posterior.
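For readers unfamiliar with the particle update the abstract alludes to, below is a minimal NumPy sketch of the standard SVGD step of [4] with a fixed-bandwidth RBF kernel. The function names, the bandwidth choice, and the Gaussian toy target are illustrative assumptions; this is not the condensed variant proposed in the paper.

```python
import numpy as np

def rbf_kernel(theta, h=1.0):
    """RBF kernel matrix and summed kernel gradients for SVGD.

    theta: (n_particles, dim) array of particle positions.
    """
    diffs = theta[:, None, :] - theta[None, :, :]           # (n, n, dim)
    K = np.exp(-np.sum(diffs ** 2, axis=-1) / h)            # (n, n)
    # sum_j grad_{theta_j} k(theta_j, theta_i): the repulsive term
    grad_K = 2.0 / h * (theta * K.sum(axis=1, keepdims=True) - K @ theta)
    return K, grad_K

def svgd_step(theta, grad_log_p, step_size=1e-2, h=1.0):
    """One SVGD update: the kernel-weighted log-posterior gradient drives
    particles toward high density; the kernel gradient keeps them spread out."""
    K, grad_K = rbf_kernel(theta, h)
    phi = (K @ grad_log_p(theta) + grad_K) / theta.shape[0]
    return theta + step_size * phi

# Toy usage: particles approximating a 2D standard-normal posterior
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    particles = 3.0 * rng.normal(size=(50, 2))
    grad_log_p = lambda x: -x                                # grad log N(0, I)
    for _ in range(500):
        particles = svgd_step(particles, grad_log_p, step_size=0.05)
    print(particles.mean(axis=0), particles.var(axis=0))
```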
Dec-20-2024