Practical Riemannian Neural Networks

Feb-25-2016–arXiv.org Machine Learning

We provide the first experimental results on nonsynthetic datasets for the quasidiagonal Riemannian gradient descents for neural networks introduced in [Oll15]. These include the MNIST, SVHN, and FACE datasets as well as a previously unpublished electroencephalogram dataset. The quasi-diagonal Riemannian algorithms consistently beat simple stochastic gradient gradient descents by a varying margin. The computational overhead with respect to simple backpropagation is around a factor 2. Perhaps more interestingly, these methods also reach their final performance quickly, thus requiring fewer training epochs and a smaller total computation time. We also present an implementation guide to these Riemannian gradient descents for neural networks, showing how the quasi-diagonal versions can be implemented with minimal effort on top of existing routines which compute gradients. We present a practical and efficient implementation of invariant stochastic gradient descent algorithms for neural networks based on the quasi-diagonal Riemannian metrics introduced in [Oll15]. These can be implemented from the same data as RMSProp-or AdaGrad-based schemes [DHS11], namely, by collecting gradients and squared gradients for each data sample. Thus we will try to present them in a way that can easily be incorporated on top of existing software providing gradients for neural networks.

artificial intelligence, gradient, machine learning, (18 more...)

arXiv.org Machine Learning

Feb-25-2016

arXiv.org PDF

Add feedback

Genre:
- Research Report (0.64)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Statistical Learning > Gradient Descent (1.00)
  - Neural Networks (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found