Chaining Meets Chain Rule: Multilevel Entropic Regularization and Training of Neural Nets
Asadi, Amir R., Abbe, Emmanuel
We introduce a family of complexity measures for the hypotheses of neural nets, based on a multilevel relative entropy. These complexity measures take into account the multilevel structure of neural nets, as opposed to the classical relative entropy (KL divergence) term derived from PAC-Bayesian bounds [1] or mutual information bounds [2, 3]. We derive these complexity measures by combining the technique of chaining mutual information (CMI) [4], an algorithm-dependent extension of the classical chaining technique paired with the mutual information bound [2], with the multilevel architecture of neural nets. We observe that if a neural net is regularized in a multilevel manner, as defined in Section 4, then one can readily construct hierarchical coverings with controlled diameters for its hypothesis set, and exploit this to obtain new multi-scale, algorithm-dependent generalization bounds and, in turn, new regularizers and training algorithms. The effect of such multilevel regularizations on the representation ability of neural nets has also recently been studied in [5, 6] for the special case where layers are nearly-identity functions, as in ResNets [7].
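As a rough, hypothetical illustration of how a layer-wise relative-entropy penalty can enter training (this is not the multilevel algorithm defined in Section 4 of the paper), the sketch below adds one Gaussian KL term per level of a small network to an ordinary empirical-risk objective. The coefficient level_weight, the prior and posterior scales, and the toy model are all assumptions made purely for the example.

    # Hypothetical sketch: per-layer Gaussian KL penalty added to a training loss,
    # one relative-entropy term per level. Not the paper's Section 4 algorithm.
    import torch
    import torch.nn as nn

    def layerwise_kl(model, prior_std=1.0, posterior_std=0.1):
        """Sum over layers l of KL( N(w_l, sigma_q^2 I) || N(0, sigma_p^2 I) ).

        Each layer's weights are treated as the mean of a Gaussian "posterior"
        and compared to a zero-mean Gaussian "prior", giving one KL term per level.
        """
        kl = 0.0
        for p in model.parameters():
            d = p.numel()
            kl = kl + 0.5 * (
                d * (posterior_std / prior_std) ** 2          # trace term
                + p.pow(2).sum() / prior_std ** 2             # mean-shift term
                - d
                + 2 * d * torch.log(torch.tensor(prior_std / posterior_std))  # log-det term
            )
        return kl

    # Usage: empirical risk plus the multilevel penalty, with an assumed per-level weight.
    model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
    x, y = torch.randn(64, 10), torch.randn(64, 1)
    level_weight = 1e-3  # hypothetical coefficient; the paper derives its weights from the bound
    loss = nn.functional.mse_loss(model(x), y) + level_weight * layerwise_kl(model)
    loss.backward()

One could, under these assumptions, give each level its own coefficient and prior scale rather than sharing them, which is closer in spirit to a multi-scale regularizer; the exact weighting in the paper follows from its generalization bound.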
Jun-26-2019