Asymptotics of representation learning in finite Bayesian neural networks

Jacob A. Zavatone-Veth, Abdulkadir Canatar, Cengiz Pehlevan

arXiv.org Machine Learning 

Recent works have suggested that finite Bayesian neural networks may outperform their infinite cousins because finite networks can flexibly adapt their internal representations. However, our theoretical understanding of how the learned hidden layer representations of finite networks differ from the fixed representations of infinite networks remains incomplete. Perturbative finite-width corrections to the network prior and posterior have been studied, but the asymptotics of learned features have not been fully characterized. Here, we argue that the leading finite-width corrections to the average feature kernels for any Bayesian network with linear readout and quadratic cost have a largely universal form. We illustrate this explicitly for two classes of fully connected networks: deep linear networks and networks with a single nonlinear hidden layer. Our results begin to elucidate which features of the data wide Bayesian neural networks learn to represent.
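To make the central object concrete: the hidden-layer feature kernel of a network is the Gram matrix of its post-activation features across a set of inputs. The sketch below is a purely illustrative toy in Python/NumPy, not the paper's perturbative calculation; all names, sizes, and the choice of tanh activation are assumptions made for illustration. It draws the weights of a single nonlinear hidden layer from a standard Gaussian prior and shows that the kernel's sample-to-sample fluctuations shrink as the width grows, which is the sense in which the infinite-width kernel is fixed. The paper's result instead concerns the leading O(1/width) correction to the posterior-averaged kernel, which this toy does not compute.

```python
import numpy as np

rng = np.random.default_rng(0)

def feature_kernel(X, W, activation=np.tanh):
    """Empirical feature kernel of a one-hidden-layer network.

    X : (num_inputs, input_dim) data matrix
    W : (input_dim, width) first-layer weights
    Returns the (num_inputs, num_inputs) Gram matrix of hidden features,
    K[i, j] = phi(x_i) . phi(x_j) / width.
    """
    Phi = activation(X @ W)           # hidden-layer features, (num_inputs, width)
    return Phi @ Phi.T / W.shape[1]   # average over hidden units

# Illustrative data; sizes are arbitrary.
num_inputs, input_dim = 8, 4
X = rng.standard_normal((num_inputs, input_dim))

# Average the kernel over prior draws of the weights at two widths.
# Under the prior, each kernel entry is a mean of `width` i.i.d. terms,
# so its fluctuations shrink roughly like 1/sqrt(width); the posterior
# adaptation studied in the paper is a separate O(1/width) effect.
for width in (16, 4096):
    kernels = [
        feature_kernel(X, rng.standard_normal((input_dim, width)) / np.sqrt(input_dim))
        for _ in range(200)
    ]
    K_mean = np.mean(kernels, axis=0)
    K_std = np.std(kernels, axis=0)
    print(f"width={width:5d}  mean |K| entry ~ {np.abs(K_mean).mean():.3f}  "
          f"typical fluctuation ~ {K_std.mean():.4f}")
```

Running this prints a fluctuation scale that drops markedly from width 16 to width 4096, while the mean kernel stays essentially unchanged, illustrating why learned, task-adapted features are a finite-width effect.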
