What can flatness teach us: understanding generalisation in Deep Neural Networks

#artificialintelligence 

This is the third post in a series summarising work that seeks to provide a theory of generalisation in Deep Neural Networks (DNNs). Briefly, the first post summarises evidence that DNNs trained with stochastic optimisers (like SGD) find functions with probability proportional to their volume in parameter space, and the second post argues that these high-volume functions are 'simple', thus explaining why DNNs generalise. In this post, we summarise results in [1] which explain why the 'flatness of the loss landscape' has been shown to correlate with generalisation, a well-known empirical result. The authors provide substantial empirical evidence that this correlation is actually a combination of (1) a weak correlation between local flatness and the parameter-space volume of the surrounding function, and (2) a strong correlation between volume and generalisation. This combination produces the observed weak correlation between 'flatness' and generalisation.
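To make the 'flatness' side of this story concrete, here is a minimal, self-contained sketch (my own illustration, not the authors' code) of one common flatness proxy: how much the training loss rises when the trained weights are jittered by random perturbations of a given radius. Flat minima tolerate larger perturbations before the loss rises, which is loosely related to the volume picture above. The toy model, data, and perturbation radii are placeholder assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Toy data and model (stand-ins; any trained network would do).
X = torch.randn(256, 20)
y = (X[:, 0] > 0).long()
model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 2))

# Train with plain SGD to reach a low-loss region of parameter space.
opt = torch.optim.SGD(model.parameters(), lr=0.1)
for _ in range(200):
    opt.zero_grad()
    F.cross_entropy(model(X), y).backward()
    opt.step()

@torch.no_grad()
def train_loss():
    return F.cross_entropy(model(X), y).item()

@torch.no_grad()
def perturbed_loss_increase(radius, n_samples=20):
    """Mean rise in training loss when each weight tensor is jittered
    by random noise of norm `radius` (a simple flatness proxy)."""
    base = train_loss()
    params = list(model.parameters())
    originals = [p.detach().clone() for p in params]
    increases = []
    for _ in range(n_samples):
        for p, p0 in zip(params, originals):
            noise = torch.randn_like(p)
            noise *= radius / noise.norm().clamp_min(1e-12)
            p.copy_(p0 + noise)
        increases.append(train_loss() - base)
    # Restore the trained weights.
    for p, p0 in zip(params, originals):
        p.copy_(p0)
    return sum(increases) / len(increases)

for r in (0.01, 0.1, 0.5):
    print(f"radius {r}: mean loss increase {perturbed_loss_increase(r):.4f}")
```

A small mean loss increase at a given radius indicates a flatter minimum. The point of [1] is that proxies like this correlate with generalisation only weakly, and largely because they correlate (weakly) with the volume of the surrounding function.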
