We thank the reviewers for their valuable comments. We respond to the main concerns below.

[R1/R2] Infinite width assumption: the infinite width assumption is needed due to the technical detail that the norm
As in Zhang et al. [31], we chose a 10k-block ResNet to stress the
We will rephrase L243 to better express this. The derivative of the weights depends on this term through the chain rule; we will make this explicit in the revised manuscript.
Deep Neural Networks as Iterated Function Systems and a Generalization Bound
Deep neural networks (DNNs) achieve remarkable performance on a wide range of tasks, yet their mathematical analysis remains fragmented: stability and generalization are typically studied in disparate frameworks and on a case-by-case basis. Architecturally, DNNs rely on the recursive application of parametrized functions, a mechanism that can be unstable and difficult to train, making stability a primary concern. Even when training succeeds, there are few rigorous results on how well such models generalize beyond the observed data, especially in the generative setting. In this work, we leverage the theory of stochastic Iterated Function Systems (IFS) and show that two important deep architectures can be viewed as, or canonically associated with, place-dependent IFS. This connection allows us to import results from random dynamical systems to (i) establish the existence and uniqueness of invariant measures under suitable contractivity assumptions, and (ii) derive a Wasserstein generalization bound for generative modeling. The bound naturally leads to a new training objective that directly controls the collage-type approximation error between the data distribution and its image under the learned transfer operator. We illustrate the theory on a controlled 2D example and empirically evaluate the proposed objective on standard image datasets (MNIST, CelebA, CIFAR-10).
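To make the abstract's construction concrete, here is a minimal, purely illustrative sketch (not the authors' code) of a network read as a place-dependent IFS and trained with a collage-type objective. It assumes details the abstract leaves open: each map is a spectrally normalized MLP blended with the identity so that it is a strict contraction by construction, and the Wasserstein term is approximated by a sliced-Wasserstein estimate between a data batch and its pushforward. All names (`IFSBlock`, `IFSNet`, `collage_loss`) are hypothetical.

```python
# Illustrative sketch: a stack of contractive place-dependent maps and a collage-type
# loss  L(theta) ~ W(mu_data, T_theta # mu_data), approximated with sliced Wasserstein.
import torch
import torch.nn as nn

class IFSBlock(nn.Module):
    """One map of the system, x -> (1 - alpha) * x + alpha * scale * f(x).
    Spectral norm keeps Lip(f) near 1 (power-iteration estimate), so with scale < 1
    the map satisfies Lip <= (1 - alpha) + alpha * scale < 1: a strict contraction
    (hypothetical construction)."""
    def __init__(self, dim: int, alpha: float = 0.5, scale: float = 0.9):
        super().__init__()
        self.f = nn.Sequential(
            nn.utils.spectral_norm(nn.Linear(dim, dim)),
            nn.Tanh(),
            nn.utils.spectral_norm(nn.Linear(dim, dim)),
        )
        self.alpha, self.scale = alpha, scale

    def forward(self, x):
        return (1 - self.alpha) * x + self.alpha * self.scale * self.f(x)

class IFSNet(nn.Module):
    """Composition of the contractive maps: the learned operator T_theta."""
    def __init__(self, dim: int, depth: int = 8):
        super().__init__()
        self.blocks = nn.ModuleList(IFSBlock(dim) for _ in range(depth))

    def forward(self, x):
        for block in self.blocks:
            x = block(x)
        return x

def sliced_wasserstein(a, b, n_proj: int = 64):
    """Monte-Carlo sliced Wasserstein-2 estimate between two equal-size batches."""
    dirs = torch.randn(a.size(1), n_proj, device=a.device)
    dirs = dirs / dirs.norm(dim=0, keepdim=True)
    pa, _ = torch.sort(a @ dirs, dim=0)   # sorted 1-D projections of batch a
    pb, _ = torch.sort(b @ dirs, dim=0)   # sorted 1-D projections of batch b
    return ((pa - pb) ** 2).mean()

def collage_loss(model, batch):
    """Collage-type objective: distance between a data batch and its pushforward."""
    return sliced_wasserstein(batch, model(batch))

# usage on a toy 2-D example
model = IFSNet(dim=2)
x = torch.randn(256, 2)
loss = collage_loss(model, x)
loss.backward()
```

The sliced estimate is used here only to keep the sketch self-contained; any Wasserstein estimator would play the same role, and the explicit contraction factor stands in for the contractivity assumption under which the invariant measure exists and is unique.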
Riemannian Residual Neural Networks
Recent methods in geometric deep learning have introduced various neural networks to operate over data that lie on Riemannian manifolds. Such networks are often necessary to learn well over graphs with a hierarchical structure or to learn over manifold-valued data encountered in the natural sciences. These networks are often inspired by and directly generalize standard Euclidean neural networks. However, extending Euclidean networks is difficult and has only been done for a select few manifolds. In this work, we examine the residual neural network (ResNet) and show how to extend this construction to general Riemannian manifolds in a geometrically principled manner. Originally introduced to help solve the vanishing gradient problem, ResNets have become ubiquitous in machine learning due to their beneficial learning properties, excellent empirical results, and easy-to-incorporate nature when building varied neural networks. We find that our Riemannian ResNets mirror these desirable properties: when compared to existing manifold neural networks designed to learn over hyperbolic space and the manifold of symmetric positive definite matrices, we outperform both kinds of networks in terms of relevant testing metrics and training dynamics.
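The abstract does not spell out how the residual step is defined on a manifold; one geometrically principled choice is to learn a vector field into the tangent space and follow it with the exponential map, x_{k+1} = exp_{x_k}(v_theta(x_k)). The sketch below illustrates that idea on the unit sphere, chosen only because its exponential map has a simple closed form (the paper itself targets hyperbolic space and SPD matrices); class and function names are hypothetical.

```python
# Illustrative sketch (not the authors' code): a "Riemannian residual block" on the
# unit sphere S^{d-1}, assuming the residual step is a learned tangent vector field
# followed by the exponential map, x_{k+1} = exp_{x_k}(v_theta(x_k)).
import torch
import torch.nn as nn

def project_to_tangent(x, u):
    """Project an ambient vector u onto the tangent space of the sphere at unit x."""
    return u - (u * x).sum(dim=-1, keepdim=True) * x

def sphere_expmap(x, v, eps: float = 1e-8):
    """Exponential map on the unit sphere: exp_x(v) = cos(|v|) x + sin(|v|) v/|v|."""
    norm = v.norm(dim=-1, keepdim=True).clamp_min(eps)
    return torch.cos(norm) * x + torch.sin(norm) * v / norm

class SphereResidualBlock(nn.Module):
    """x -> exp_x( proj_{T_x}( f(x) ) ), with f a small Euclidean MLP."""
    def __init__(self, dim: int, hidden: int = 64):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, dim))

    def forward(self, x):
        v = project_to_tangent(x, self.f(x))   # tangent vector at x
        return sphere_expmap(x, v)             # step stays on the manifold

# usage: points on S^2 pushed through two residual blocks
x = torch.nn.functional.normalize(torch.randn(16, 3), dim=-1)
net = nn.Sequential(SphereResidualBlock(3), SphereResidualBlock(3))
y = net(x)
print(y.norm(dim=-1))  # ~1: outputs remain on the sphere
```

Because the output of each block lies on the manifold by construction, the blocks compose like ordinary ResNet layers; swapping in another manifold amounts to replacing the tangent projection and exponential map.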