Volume-preserving Neural Networks: A Solution to the Vanishing Gradient Problem

MacDonald, Gordon; Godbout, Andrew; Gillcash, Bryn; Cairns, Stephanie

Nov-22-2019–arXiv.org Machine Learning

Department of Mathematics and Statistics, McGill University, Montreal, QC H3A 0E9, Canada

arXiv:1911.09576v2

Abstract

We propose a novel approach to addressing the vanishing (or exploding) gradient problem in deep neural networks. We construct a new architecture for deep neural networks where all layers (except the output layer) of the network are a combination of rotation, permutation, diagonal, and activation sublayers which are all volume preserving. This control on the volume forces the gradient (on average) to maintain equilibrium and not explode or vanish. Volume-preserving neural networks train reliably, quickly and accurately and the learning rate is consistent across layers in deep volume-preserving neural networks. To demonstrate this we apply our volume-preserving neural network model to two standard datasets.

Keywords: volume-preserving, neural network, machine learning, deep learning, vanishing gradient problem

1. Introduction

Deep neural networks are characterized by the composition of a large number of functions (aka layers), each typically consisting of an affine transformation followed by a non-affine "activation function". Each layer is determined by a number of parameters which are trained on data to approximate some function. The deepness refers to the number of such functions composed (or the number of layers). The number of layers required to be deep is not well-defined, but an overview of deep learning (Schmidhuber, 2015) states that any
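The abstract's claim that rotation, permutation, and diagonal sublayers are volume preserving can be checked numerically: a linear map preserves volume exactly when the absolute value of its determinant is 1. The following is a minimal sketch of that check in pure Python, not the authors' implementation. The helper names (rotation, permutation, unit_diagonal, linear_part) are hypothetical, the geometric-mean rescaling used to give the diagonal a determinant of 1 is an assumption made for illustration, and the activation sublayer is not covered.

```python
import math


def rotation(n, i, j, theta):
    """n x n rotation in the (i, j) coordinate plane; its determinant is 1."""
    m = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    m[i][i] = math.cos(theta)
    m[j][j] = math.cos(theta)
    m[i][j] = -math.sin(theta)
    m[j][i] = math.sin(theta)
    return m


def permutation(perm):
    """Matrix sending basis vector k to basis vector perm[k]; determinant is +1 or -1."""
    n = len(perm)
    return [[1.0 if perm[c] == r else 0.0 for c in range(n)] for r in range(n)]


def unit_diagonal(raw):
    """Diagonal matrix from positive entries, rescaled so their product is 1.

    Assumption: dividing by the geometric mean is just one simple way to
    obtain determinant 1; the source does not specify the construction.
    """
    n = len(raw)
    log_mean = sum(math.log(x) for x in raw) / n
    scaled = [x / math.exp(log_mean) for x in raw]
    return [[scaled[r] if r == c else 0.0 for c in range(n)] for r in range(n)]


def matmul(a, b):
    """Plain matrix product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n = len(a)
    d = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            d = -d  # a row swap flips the sign of the determinant
        d *= a[col][col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= f * a[col][c]
    return d


def linear_part(n, theta, perm, raw_diag):
    """Compose one rotation, one permutation, and one unit-determinant diagonal."""
    return matmul(rotation(n, 0, 1, theta),
                  matmul(permutation(perm), unit_diagonal(raw_diag)))


layer = linear_part(4, 0.7, [2, 0, 3, 1], [0.5, 2.0, 3.0, 1.5])
volume_factor = abs(det(layer))  # factor by which the layer scales volume
```

Because determinants multiply under composition, stacking any number of such linear parts keeps the volume factor at 1, which is the property the abstract ties to gradients that neither explode nor vanish on average.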



