Learning Neural Networks with Adaptive Regularization

Zhao, Han, Tsai, Yao-Hung Hubert, Salakhutdinov, Ruslan, Gordon, Geoffrey J.

arXiv.org Machine Learning 

Although deep neural networks have been widely applied in various domains [19, 25, 27], their parameters are usually learned via the principle of maximum likelihood, so their success crucially hinges on the availability of large-scale datasets. When training rich models on small datasets, explicit regularization techniques are crucial to alleviate overfitting. Previous works have explored various regularization [39] and data augmentation [19, 38] techniques to learn diversified representations. In this paper, we look in an alternative direction by proposing an adaptive, data-dependent regularization method that encourages neurons of the same layer to share statistical strength. The goal of our method is to prevent overfitting when training (large) networks on small datasets. Our key insight stems from the famous argument by Efron [8] in the empirical Bayes literature: it is beneficial to learn from the experience of others. From an algorithmic perspective, we argue that the connection weights of neurons in the same layer (row/column vectors of the weight matrix) become correlated with each other through backpropagation. Hence, by learning the correlations of the weight matrix, a neuron can "borrow statistical strength" from other neurons in the same layer.
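To make the idea concrete, here is a minimal sketch of how such an adaptive, data-dependent regularizer could look. It is an illustrative assumption, not the paper's actual algorithm: it penalizes a layer's weight matrix `W` under a row covariance `Sigma` that is itself re-estimated from the current weights (an empirical-Bayes-style alternation), so that correlated neurons shrink toward each other rather than independently toward zero. The function names and the update rule are hypothetical.

```python
import numpy as np


def adaptive_reg_penalty(W, Sigma_inv):
    """Hypothetical penalty coupling the rows of W (the neurons of one
    layer) through a learned inverse row-covariance Sigma_inv.

    Returns trace(W^T Sigma_inv W): a quadratic form that is small when
    the rows of W are consistent with the covariance structure Sigma.
    """
    return float(np.trace(W.T @ Sigma_inv @ W))


def update_covariance(W, eps=1e-3):
    """Empirical-Bayes-style step (an assumed, simplified update):
    re-estimate the row covariance from the current weights, add a
    small ridge eps*I for invertibility, and return its inverse.
    """
    d = W.shape[0]
    Sigma = (W @ W.T) / W.shape[1] + eps * np.eye(d)
    return np.linalg.inv(Sigma)
```

In a training loop, one would alternate between gradient steps on the task loss plus `adaptive_reg_penalty`, and periodic calls to `update_covariance`, so the regularizer adapts to the correlations that backpropagation induces among neurons of the same layer.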
