A PAC-Bayes oracle inequality for sparse neural networks

Maximilian F. Steffen and Mathias Trabs

arXiv.org Machine Learning 

Driven by the enormous success of neural networks in a broad spectrum of machine learning applications, see Goodfellow et al. [16] and Schmidhuber [29] for an introduction, the theoretical understanding of network-based methods is a dynamic and flourishing research area at the intersection of mathematical statistics, optimization and approximation theory. In addition to theoretical guarantees, uncertainty quantification is an important and challenging problem for neural networks and has motivated the introduction of Bayesian neural networks, where a distribution is learned for each network weight, see Graves [17] and Blundell et al. [8] and numerous subsequent articles. In this work we study the Gibbs posterior distribution for a stochastic neural network. In a nonparametric regression problem, we show that the corresponding estimator achieves a minimax-optimal prediction risk bound up to a logarithmic factor. Moreover, the method is adaptive with respect to the unknown regularity and structure of the regression function. While early theoretical foundations for neural nets are summarized by Anthony & Bartlett [4], the excellent approximation properties of deep neural nets, especially with the ReLU activation function, have been discovered in recent years, see e.g.
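For orientation, a standard construction of the Gibbs posterior in nonparametric regression (a generic textbook form, not necessarily the exact definition used in the paper) starts from observations $(X_i, Y_i)_{i=1}^n$, a prior $\pi$ on the parameters $\theta$ of a sparse network $f_\theta$ and an inverse temperature $\lambda > 0$:

$$\hat\rho_\lambda(\mathrm{d}\theta) \;\propto\; \exp\bigl(-\lambda n R_n(\theta)\bigr)\,\pi(\mathrm{d}\theta), \qquad R_n(\theta) \;=\; \frac{1}{n}\sum_{i=1}^n \bigl(Y_i - f_\theta(X_i)\bigr)^2.$$

A PAC-Bayes oracle inequality for such a posterior typically states that, with probability at least $1-\varepsilon$,

$$\mathbb{E}_{\theta\sim\hat\rho_\lambda}\bigl[R(\theta)\bigr] \;\le\; \inf_{\rho \ll \pi}\Bigl\{ C_1\,\mathbb{E}_{\theta\sim\rho}\bigl[R(\theta)\bigr] + \frac{C_2}{n}\Bigl(\mathrm{KL}(\rho\,\Vert\,\pi) + \log\tfrac{1}{\varepsilon}\Bigr)\Bigr\},$$

where $R$ is the prediction risk, $\mathrm{KL}$ the Kullback-Leibler divergence and $C_1, C_2$ constants depending on $\lambda$; the precise risk notion, constants and sparsity-inducing prior in the paper may differ from this sketch. Choosing $\rho$ concentrated around a well-approximating sparse network then trades off approximation error against the complexity term $\mathrm{KL}(\rho\,\Vert\,\pi)$, which is the standard route to minimax-optimal rates up to logarithmic factors.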
