Adaptive Methods for Nonconvex Optimization

Neural Information Processing Systems

Equal contribution. 32nd Conference on Neural Information Processing Systems (NeurIPS 2018), Montréal, Canada. [...] is often attributed to the rapid decay in the learning rate when gradients are dense, which is common in machine learning applications.



Stochastic Composite Mirror Descent: Optimal Bounds with High Probabilities

Neural Information Processing Systems

Although much theoretical analysis has been devoted to understanding the practical behavior of stochastic gradient descent (SGD) and stochastic composite mirror descent (SCMD), the existing theoretical results are still not fully satisfactory. First, most existing results are stated in expectation, which inevitably ignores some information on the high-order moments of the random variable of interest.



The committee machine: Computational to statistical gaps in learning a two-layers neural network

Neural Information Processing Systems

Heuristic tools from statistical physics have been used in the past to locate the phase transitions and compute the optimal learning and generalization errors in the teacher-student scenario for multi-layer neural networks. In this contribution, we provide a rigorous justification of these approaches for a two-layer neural network model called the committee machine.