Norm matters: efficient and accurate normalization schemes in deep networks

Elad Hoffer, Ron Banner, Itay Golan, Daniel Soudry

Neural Information Processing Systems 

Over the past few years, Batch-Normalization has been commonly used in deep networks, allowing faster training and high performance for a wide variety of applications. However, the reasons behind its merits remain poorly understood, and several shortcomings hinder its use for certain tasks. In this work, we present a novel view of the purpose and function of normalization methods and weight decay: they serve as tools to decouple the weights' norm from the underlying optimized objective. This property highlights the connection between practices such as normalization, weight decay, and learning-rate adjustments.
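The decoupling the abstract refers to can be illustrated with a minimal NumPy sketch (not from the paper): the output of a batch-normalized linear layer is invariant to rescaling the weight matrix by any positive constant, so the weights' norm is separated from the function the layer computes.

```python
import numpy as np

rng = np.random.default_rng(0)

def batch_norm(z, eps=1e-5):
    # Standardize pre-activations over the batch dimension.
    return (z - z.mean(axis=0)) / np.sqrt(z.var(axis=0) + eps)

x = rng.standard_normal((128, 16))   # batch of 128 inputs, 16 features
w = rng.standard_normal((16, 8))     # weights of a hypothetical linear layer

out = batch_norm(x @ w)
out_scaled = batch_norm(x @ (5.0 * w))  # rescale weights by a positive constant

# Batch-norm output is (numerically) invariant to the scale of w.
print(np.allclose(out, out_scaled))  # → True
```

Because of this invariance, weight decay cannot change the layer's function directly; it only shrinks the weights' norm, which in turn changes the effective step size of gradient updates, linking it to learning-rate adjustments as the abstract notes.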
