Beyond Backprop: Alternating Minimization with co-Activation Memory

Choromanska, Anna, Kumaravel, Sadhana, Luss, Ronny, Rish, Irina, Kingsbury, Brian, Tejwani, Ravi, Bouneffouf, Djallel

Jun-23-2018–arXiv.org Machine Learning

We propose a novel online algorithm for training deep feedforward neural networks that employs alternating minimization (block-coordinate descent) between the weights and activation variables. It extends offline alternating minimization approaches to online, continual learning, and improves over stochastic gradient descent (SGD) with backpropagation in several ways: it avoids the vanishing gradient issue, it allows for non-differentiable nonlinearities, and it permits parallel weight updates across the layers. Unlike SGD, our approach employs co-activation memory inspired by the online sparse coding algorithm of [Mairal et al., 2009]. Furthermore, local iterative optimization with explicit activation updates is a potentially more biologically plausible learning mechanism than backpropagation. We provide theoretical convergence analysis and promising empirical results on several datasets.
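To make the alternation described in the abstract concrete, here is a minimal NumPy sketch. It is not the authors' implementation. It assumes a single hidden layer, a ReLU nonlinearity, a squared-error loss, and a quadratic penalty with weight `mu` tying the hidden code to the first layer's output. The names `update_code`, `bcd_update`, `AltMinSketch`, and `mu` are hypothetical and chosen for illustration. Each online step first updates the activation variables with the weights fixed, then accumulates co-activation memory matrices `A` and `B` in the style of online dictionary learning, and finally updates each layer's weights from its own memory by one sweep of column-wise coordinate descent.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def update_code(x, y, W1, W2, mu=1.0, steps=10):
    """Activation step: with weights fixed, descend on
    0.5*||W2 relu(c) - y||^2 + 0.5*mu*||c - W1 x||^2 with respect to c."""
    c = W1 @ x                                    # initialise from a forward pass
    lr = 1.0 / (np.linalg.norm(W2, 2) ** 2 + mu)  # conservative step size (assumption)
    for _ in range(steps):
        resid = W2 @ relu(c) - y
        grad = (W2.T @ resid) * (c > 0) + mu * (c - W1 @ x)  # ReLU subgradient
        c = c - lr * grad
    return c

def bcd_update(W, A, B, eps=1e-8):
    """Weight step: one sweep of column-wise coordinate descent on
    0.5*tr(W A W^T) - tr(W B^T), using only the memory matrices A and B."""
    W = W.copy()
    for j in range(A.shape[0]):
        if A[j, j] > eps:                         # skip units with no recorded activity
            W[:, j] += (B[:, j] - W @ A[:, j]) / A[j, j]
    return W

class AltMinSketch:
    """Hypothetical one-hidden-layer model: y ~ W2 relu(c), c ~ W1 x."""

    def __init__(self, d_in, d_hid, d_out, mu=1.0, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(scale=0.5, size=(d_hid, d_in))
        self.W2 = rng.normal(scale=0.5, size=(d_out, d_hid))
        self.mu = mu
        # Co-activation memory: A holds input-input products, B holds target-input products.
        self.A1 = np.zeros((d_in, d_in))
        self.B1 = np.zeros((d_hid, d_in))
        self.A2 = np.zeros((d_hid, d_hid))
        self.B2 = np.zeros((d_out, d_hid))

    def step(self, x, y):
        c = update_code(x, y, self.W1, self.W2, self.mu)
        h = relu(c)
        self.A1 += np.outer(x, x)
        self.B1 += np.outer(c, x)
        self.A2 += np.outer(h, h)
        self.B2 += np.outer(y, h)
        # The two updates below share no variables, so they could run in parallel.
        self.W1 = bcd_update(self.W1, self.A1, self.B1)
        self.W2 = bcd_update(self.W2, self.A2, self.B2)

    def predict(self, x):
        return self.W2 @ relu(self.W1 @ x)

# Usage on a made-up stream of samples.
rng = np.random.default_rng(1)
net = AltMinSketch(d_in=4, d_hid=8, d_out=2)
for _ in range(20):
    x = rng.normal(size=4)
    net.step(x, np.array([x[0] + x[1], x[2] - x[3]]))
```

Because the weight step reads only `A` and `B`, no gradient is propagated through the layers, and the nonlinearity enters only the activation step, where a subgradient is used here in place of a derivative.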
