Partial local entropy and anisotropy in deep weight spaces

Sep-10-2020–arXiv.org Machine Learning

Recent studies on the weight space of deep neural networks [1, 2] have highlighted the existence of rare subdominant clusters of configurations which yield a high test accuracy. Although these clusters constitute a deviation from typicality, they are efficiently encountered by stochastic gradient descent (SGD) algorithms and correspond to wide valleys of suitable loss functions, such as cross entropy [3]. An analogous circumstance occurs in the context of constraint satisfaction problems, where the search for clusters of solutions improves when the loss function is supplemented with a term that favors a high local density of solutions [4]. To count the solutions in the vicinity of a specific weight configuration, one can define a local solution-counting functional, namely, a local entropy. Classification tasks performed by means of quantized neural networks (where the weights are discrete) can be interpreted as constraint satisfaction problems. There are, however, two reasons to generalize the concept of local entropy: first, classification problems are typically required to reach a high but not necessarily perfect accuracy; second, they are often approached with machines that have continuous weights.
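
To make the solution-counting idea concrete, here is a minimal brute-force sketch. It is not taken from the paper. It assumes a toy binary perceptron with weights in {-1, +1}, takes "vicinity" to mean a Hamming ball of radius d around a reference configuration, and normalizes the log-count by the number of weights. The names `accuracy`, `local_entropy`, and the `min_accuracy` threshold are hypothetical. Setting `min_accuracy` below 1 is one possible way to relax the perfect-accuracy requirement and may differ from the paper's own definition.

```python
import itertools
import math

import numpy as np


def accuracy(w, X, y):
    """Fraction of patterns that the perceptron sign(X @ w) labels correctly."""
    return float(np.mean(np.sign(X @ w) == y))


def local_entropy(w_ref, X, y, d, min_accuracy=1.0):
    """Log-count, divided by N, of configurations within Hamming distance d
    of w_ref whose accuracy is at least min_accuracy (toy definition)."""
    n = len(w_ref)
    count = 0
    for k in range(d + 1):
        # Enumerate every way of flipping exactly k of the n weights.
        for flips in itertools.combinations(range(n), k):
            w = w_ref.copy()
            idx = np.array(flips, dtype=int)
            w[idx] *= -1
            if accuracy(w, X, y) >= min_accuracy:
                count += 1
    return math.log(count) / n if count > 0 else float("-inf")


# Toy data: labels come from a random "teacher", so the teacher is a solution.
# N is odd and all entries are +/-1, so X @ w is never zero.
rng = np.random.default_rng(0)
N, P = 11, 5
teacher = rng.choice([-1, 1], size=N)
X = rng.choice([-1, 1], size=(P, N))
y = np.sign(X @ teacher)

for d in range(4):
    print(d, local_entropy(teacher, X, y, d))
```

The enumeration grows combinatorially with N and d, so this only works for very small systems. It is meant to illustrate what is being counted, not how one would estimate it in practice.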

neural network, regularization, weight space


Country:
- South America > Chile
  - Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- Europe > Spain
  - Galicia > A Coruña Province > Santiago de Compostela (0.04)

Genre:
- Research Report > Experimental Study (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Constraint-Based Reasoning (1.00)
  - Machine Learning
    - Statistical Learning > Gradient Descent (0.57)
    - Neural Networks > Deep Learning (0.34)
