Understanding Deep Learning Requires Re-thinking Generalization

@machinelearnbot 

Understanding deep learning requires re-thinking generalization Zhang et al., ICLR'17 This paper has a wonderful combination of properties: the results are easy to understand, somewhat surprising, and then leave you pondering over what it all might mean for a long while afterwards! What is it that distinguishes neural networks that generalize well from those that don't? A satisfying answer to this question would not only help to make neural networks more interpretable, but it might also lead to more principled and reliable model architecture design. By "generalize well," the authors simply mean "what causes a network that performs well on training data to also perform well on the (held out) test data?" If you think about that for a moment, the question pretty much boils down to "why do neural networks work as well as they do?"