Goto

Collaborating Authors

 Uncertainty


Generalization bound of globally optimal non-convex neural network training: Transportation map estimation by infinite dimensional Langevin dynamics

Neural Information Processing Systems

We introduce a new theoretical framework to analyze deep learning optimization with connection to its generalization error. Existing frameworks such as mean field theory and neural tangent kernel theory for neural network optimization analysis typically require taking limit of infinite width of the network to show its global convergence. This potentially makes it difficult to directly deal with finite width network; especially in the neural tangent kernel regime, we cannot reveal favorable properties of neural networks beyond kernel methods. To realize more natural analysis, we consider a completely different approach in which we formulate the parameter training as a transportation map estimation and show its global convergence via the theory of the infinite dimensional Langevin dynamics . This enables us to analyze narrow and wide networks in a unifying manner. Moreover, we give generalization gap and excess risk bounds for the solution obtained by the dynamics. The excess risk bound achieves the so-called fast learning rate. In particular, we show an exponential convergence for a classification problem and a minimax optimal rate for a regression problem.









Efficient Online Estimation of Causal Effects by Deciding What to Observe

Neural Information Processing Systems

However, this perspective fails to address the difficult data collection decisions that precede such modeling efforts. Doctors must select a set of tests to run. Survey designers must select a slate of questions to ask. Companies must select which datasets to purchase.