
 injectivity capacity


Deep ReLU networks -- injectivity capacity upper bounds

arXiv.org Machine Learning

We study deep ReLU feed-forward neural networks (NN) and their injectivity abilities. The main focus is on \emph{precisely} determining the so-called injectivity capacity. For any given hidden-layer architecture, it is defined as the minimal ratio between the number of the network's outputs and the number of its inputs that ensures unique recoverability of the input from a realizable output. Strong recent progress in precisely studying single ReLU layer injectivity properties is here moved to the deep network level. In particular, we develop a program that connects deep $l$-layer net injectivity to an $l$-extension of the $\ell_0$ spherical perceptrons, thereby massively generalizing an isomorphism between studying single-layer injectivity and the capacity of the so-called (1-extension) $\ell_0$ spherical perceptrons discussed in [82]. \emph{Random duality theory} (RDT) based machinery is then created and utilized to statistically handle properties of the extended $\ell_0$ spherical perceptrons and, implicitly, of the deep ReLU NNs. A sizeable set of numerical evaluations is conducted as well to put the entire RDT machinery to practical use. From these we observe a rapidly decreasing tendency in the layer expansions needed, i.e., we observe a rapid \emph{expansion saturation effect}. Only $4$ layers of depth are sufficient to closely approach the level of no needed expansion -- a result that fairly closely resembles observations made in practical experiments and that has so far remained completely out of reach of the existing mathematical methodologies.


Injectivity capacity of ReLU gates

arXiv.org Machine Learning

Over the last 15-20 years we have been witnessing a rapid development of machine learning (ML) and neural network (NN) concepts. As the need for efficient processing and interpretation of large data sets is expected to grow further in the years to come, many fundamental algorithmic and theoretical NN breakthroughs are to be expected. To adequately address the upcoming challenges, an excellent understanding of the ultimate limits of the employed technologies is needed. In this paper we study a mathematical problem that is directly connected to a notion of network capacity, which is an example of such a limit. Characterizing the presence or absence of injectivity as a property of random functions is the mathematical problem of interest here. The very definition of functional injectivity implies its critical role in studying inverse problems. Namely, the well- or ill-posedness of these problems is in direct correspondence with the associated injectivity. The recent utilization of neural networks in studying (nonlinear) inverse problems therefore critically relies on their injectivity properties (see, e.g., [6,11,15,16,19,31,36,38]). Consequently, injectivity, a purely mathematical object, becomes in these contexts a practically important feature of NN architectures.