Chaining Meets Chain Rule: Multilevel Entropic Regularization and Training of Neural Nets

Asadi, Amir R., Abbe, Emmanuel

arXiv.org Machine Learning 

We introduce a family of complexity measures for the hypotheses of neural nets, based on a multilevel relative entropy. These complexity measures take into account the multilevel structure of neural nets, as opposed to the classical relative entropy (KL-divergence) term derived from PAC-Bayesian bounds [1] or mutual information bounds [2, 3]. We derive these complexity measures by combining the technique of chaining mutual information (CMI) [4], an algorithm-dependent extension of the classical chaining technique paired with the mutual information bound [2], with the multilevel architecture of neural nets. It is observed in this paper that if a neural net is regularized in a multilevel manner as defined in Section 4, then one can readily construct hierarchical coverings with controlled diameters for its hypothesis set, and exploit this to obtain new multi-scale and algorithm-dependent generalization bounds and, in turn, new regularizers and training algorithms. The effect of such multilevel regularizations on the representation ability of neural nets has also been recently studied in [5, 6] for the special case where layers are nearly-identity functions as for ResNets [7].
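To make the idea of a multilevel entropic regularizer concrete, here is a minimal sketch, not the authors' exact construction: a weighted sum of per-level Gaussian KL terms, one per group of layers, where finer levels (smaller prior scales) loosely play the role of finer coverings in the chaining argument. The layer grouping, the coefficients `betas`, and the prior scales `prior_sigmas` are hypothetical placeholders, not values from the paper.

```python
import math
import torch


def gaussian_kl(mu, log_var, prior_sigma):
    """KL( N(mu, exp(log_var)) || N(0, prior_sigma^2) ), summed over all parameters."""
    var = log_var.exp()
    prior_var = prior_sigma ** 2
    return 0.5 * ((var + mu ** 2) / prior_var - 1.0 - log_var + math.log(prior_var)).sum()


def multilevel_kl_regularizer(mu_groups, log_var_groups, betas, prior_sigmas):
    """Hypothetical multilevel regularizer: a weighted sum of per-level KL terms.

    mu_groups / log_var_groups: posterior mean and log-variance tensors, one per level
    betas: per-level weights (assumed, e.g. geometrically decaying across levels)
    prior_sigmas: per-level prior scales (assumed, shrinking at finer levels)
    """
    total = torch.zeros(())
    for mu, log_var, beta, sigma in zip(mu_groups, log_var_groups, betas, prior_sigmas):
        total = total + beta * gaussian_kl(mu, log_var, sigma)
    return total
```

In training, such a term would simply be added to the empirical loss, with each level's weight `beta` controlling how strongly that scale of the hierarchy is regularized; this is only meant to illustrate the "multilevel" structure, not to reproduce the bounds or the algorithm derived in the paper.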
