Functional dimension


On Functional Dimension and Persistent Pseudodimension

Grigsby, J. Elisenda, Lindsey, Kathryn

arXiv.org Artificial Intelligence

For any fixed feedforward ReLU neural network architecture, it is well known that many different parameter settings can determine the same function. It is less well known that the degree of this redundancy is inhomogeneous across parameter space. In this work, we discuss two locally applicable complexity measures for ReLU network classes and what we know about the relationship between them: (1) the local functional dimension [14, 18], and (2) a local version of VC dimension that we call persistent pseudodimension. The former is easy to compute on finite batches of points; the latter should give local bounds on the generalization gap, which would inform an understanding of the mechanics of the double descent phenomenon [7].
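
As a concrete illustration of the first measure, here is a minimal sketch (my construction, not the authors' code) that estimates the local functional dimension of a small ReLU network on a finite batch: it takes the rank of the Jacobian of the batch output map with respect to the parameters, computed by finite differences in NumPy. The architecture, batch size, and tolerances are arbitrary choices for illustration.

import numpy as np

rng = np.random.default_rng(0)
widths = [2, 4, 3, 2]  # (input, hidden, hidden, output); arbitrary example

def forward(theta, X):
    # Evaluate the network on a batch X of shape (batch, widths[0]) and
    # return the flattened batch output map theta -> (F_theta(x_1), ...).
    h, i = X, 0
    for k, (n, m) in enumerate(zip(widths[:-1], widths[1:])):
        W = theta[i:i + m * n].reshape(m, n); i += m * n
        b = theta[i:i + m]; i += m
        h = h @ W.T + b
        if k < len(widths) - 2:        # ReLU on hidden layers only
            h = np.maximum(h, 0.0)
    return h.ravel()

n_params = sum(m * n + m for n, m in zip(widths[:-1], widths[1:]))
theta = rng.standard_normal(n_params)
X = rng.standard_normal((40, widths[0]))   # a finite batch of inputs

# Finite-difference Jacobian of the batch output map w.r.t. the parameters;
# the rank tolerance is set well above the O(eps) finite-difference noise.
eps, f0 = 1e-6, forward(theta, X)
J = np.empty((f0.size, n_params))
for j in range(n_params):
    t = theta.copy(); t[j] += eps
    J[:, j] = (forward(t, X) - f0) / eps

print("parameters:", n_params,
      "| batch functional dimension:", np.linalg.matrix_rank(J, tol=1e-4))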


Geometry-induced Implicit Regularization in Deep ReLU Neural Networks

Bona-Pellissier, Joachim, Malgouyres, François, Bachoc, François

arXiv.org Artificial Intelligence

It is well known that neural networks with many more parameters than training examples do not overfit. Implicit regularization phenomena, which are still not well understood, occur during optimization, and 'good' networks are favored. Thus the number of parameters is not an adequate measure of complexity if we do not consider all possible networks but only the 'good' ones. To better understand which networks are favored during optimization, we study the geometry of the output set as parameters vary. When the inputs are fixed, we prove that the dimension of this set changes and that the local dimension, called the batch functional dimension, is almost surely determined by the activation patterns in the hidden layers. We prove that the batch functional dimension is invariant to the symmetries of the network parameterization: neuron permutations and positive rescalings. Empirically, we establish that the batch functional dimension decreases during optimization. As a consequence, optimization leads to parameters with low batch functional dimensions. We call this phenomenon geometry-induced implicit regularization. The batch functional dimension depends on both the network parameters and inputs. To understand the impact of the inputs, we study, for fixed parameters, the largest attainable batch functional dimension when the inputs vary. We prove that this quantity, called the computable full functional dimension, is also invariant to the symmetries of the network's parameterization, and is determined by the achievable activation patterns. We also provide a sampling theorem, showing a fast convergence of the estimation of the computable full functional dimension for a random input of increasing size. Empirically, we find that the computable full functional dimension remains close to the number of parameters, which is related to the notion of local identifiability. This differs from the observed values for the batch functional dimension computed on training inputs and test inputs. The latter are influenced by geometry-induced implicit regularization.
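
The two symmetries named in the abstract are easy to verify numerically. The following sketch (my own, using an arbitrary one-hidden-layer network) checks that a positive rescaling at a hidden neuron and a permutation of the hidden neurons both leave the realized function, and hence any function-determined quantity such as the batch functional dimension, unchanged.

import numpy as np

rng = np.random.default_rng(1)
n_in, n_hid, n_out = 3, 5, 2   # arbitrary sizes for illustration

# One-hidden-layer ReLU network: f(x) = W2 @ relu(W1 @ x + b1) + b2
W1, b1 = rng.standard_normal((n_hid, n_in)), rng.standard_normal(n_hid)
W2, b2 = rng.standard_normal((n_out, n_hid)), rng.standard_normal(n_out)

def f(W1, b1, W2, b2, X):
    return np.maximum(X @ W1.T + b1, 0.0) @ W2.T + b2

X = rng.standard_normal((100, n_in))
y = f(W1, b1, W2, b2, X)

# Positive rescaling at hidden neuron 0: since relu(c*z) = c*relu(z) for
# c > 0, scaling the incoming weights and bias by c and the outgoing
# weights by 1/c leaves the realized function unchanged.
c = 2.7
W1s, b1s, W2s = W1.copy(), b1.copy(), W2.copy()
W1s[0] *= c; b1s[0] *= c; W2s[:, 0] /= c
assert np.allclose(y, f(W1s, b1s, W2s, b2, X))

# Neuron permutation: reorder the hidden units together with the matching
# columns of the outgoing weight matrix.
perm = rng.permutation(n_hid)
assert np.allclose(y, f(W1[perm], b1[perm], W2[:, perm], b2, X))
print("both symmetries preserve the realized function on the batch")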


Hidden symmetries of ReLU networks

Grigsby, J. Elisenda, Lindsey, Kathryn, Rolnick, David

arXiv.org Artificial Intelligence

The parameter space for any fixed architecture of feedforward ReLU neural networks serves as a proxy during training for the associated class of functions - but how faithful is this representation? It is known that many different parameter settings can determine the same function. Moreover, the degree of this redundancy is inhomogeneous: for some networks, the only symmetries are permutation of neurons in a layer and positive scaling of parameters at a neuron, while other networks admit additional hidden symmetries. In this work, we prove that, for any network architecture where no layer is narrower than the input, there exist parameter settings with no hidden symmetries. We also describe a number of mechanisms through which hidden symmetries can arise, and empirically approximate the functional dimension of different network architectures at initialization. These experiments indicate that the probability that a network has no hidden symmetries decreases towards 0 as depth increases and increases towards 1 as width and input dimension increase.
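
In the spirit of the experiments described above, the following sketch (my construction, with arbitrary architecture, batch size, and trial count; not the authors' code) estimates the functional dimension at random initialization as the rank of the parameter Jacobian of the batch output map, and counts how often it attains the upper bound of #parameters minus #hidden neurons (the dimension lost to positive rescalings). Attaining the bound is used here only as a rough proxy for the absence of hidden symmetries.

import torch

def batch_fun_dim(widths, batch=200, seed=0):
    # Rank of the Jacobian of the batch output map w.r.t. a random parameter
    # vector, plus the combinatorial upper bound on the functional dimension.
    g = torch.Generator().manual_seed(seed)
    n_params = sum(m * n + m for n, m in zip(widths[:-1], widths[1:]))
    theta = torch.randn(n_params, generator=g, dtype=torch.float64)
    X = torch.randn(batch, widths[0], generator=g, dtype=torch.float64)

    def f(t):
        out, i = X, 0
        for k, (n, m) in enumerate(zip(widths[:-1], widths[1:])):
            W = t[i:i + m * n].reshape(m, n); i += m * n
            b = t[i:i + m]; i += m
            out = out @ W.T + b
            if k < len(widths) - 2:        # ReLU on hidden layers only
                out = torch.relu(out)
        return out.ravel()

    J = torch.autograd.functional.jacobian(f, theta)
    bound = n_params - sum(widths[1:-1])   # each rescaling symmetry costs 1
    return torch.linalg.matrix_rank(J, rtol=1e-6).item(), bound

widths = [2, 4, 4, 1]                      # arbitrary small architecture
hits = 0
for s in range(20):
    dim, bound = batch_fun_dim(widths, seed=s)
    hits += int(dim == bound)
print(f"upper bound attained at {hits}/20 random initializations")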


Functional dimension of feedforward ReLU neural networks

Grigsby, J. Elisenda, Lindsey, Kathryn, Meyerhoff, Robert, Wu, Chenxi

arXiv.org Artificial Intelligence

It is well known that the parameterized family of functions representable by fully-connected feedforward neural networks with ReLU activation function is precisely the class of piecewise linear functions with finitely many pieces. It is less well known that for every fixed ReLU neural network architecture, the parameter space admits positive-dimensional spaces of symmetries, and hence the local functional dimension near any given parameter is lower than the parametric dimension. In this work we carefully define the notion of functional dimension, show that it is inhomogeneous across the parameter space of ReLU neural network functions, and continue an investigation, initiated in [14] and [5], into when the functional dimension achieves its theoretical maximum. We also study the quotient space and fibers of the realization map from parameter space to function space, supplying examples of fibers that are disconnected, fibers upon which functional dimension is non-constant, and fibers upon which the symmetry group acts non-transitively.
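
To make the dimension count concrete, here is a back-of-the-envelope version of the rescaling bound (my paraphrase; see the paper for the precise hypotheses and the definition of functional dimension). For an architecture $(n_0, n_1, \ldots, n_d)$, the parametric dimension (weights plus biases) is
$$ D \;=\; \sum_{i=1}^{d} n_i \,(n_{i-1} + 1). $$
Each of the $N = \sum_{i=1}^{d-1} n_i$ hidden neurons carries a one-parameter family of positive rescaling symmetries that fixes the realized function, so the local functional dimension satisfies
$$ \dim_{\mathrm{fun}}(\theta) \;\le\; D - N. $$
For example, the architecture $(1, 4, 4, 1)$ has $D = 8 + 20 + 5 = 33$ and $N = 8$, so the functional dimension near any parameter is at most $25 < 33$.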