On Quadratic Penalties in Elastic Weight Consolidation

Dec-11-2017–arXiv.org Machine Learning

There are situations in which we would like to train a neural network to perform a range of tasks. This is usually possible if we can train the network on all tasks simultaneously. The problem is harder if we would like to train the network sequentially, one task after another. The naive approach of training a trained neural network on a new task via gradient descent leads to a phenomenon known as catastrophic forgetting: the network's performance on previously learned tasks rapidly deteriorates as soon as we start training on a new task. Kirkpatrick et al. [2017] propose a novel algorithm, elastic weight consolidation (EWC), to address this problem, while maintaining the simplicity of relying on backpropagation and stochastic gradient descent as the main algorithmic workhorses. The authors observe that catastrophic forgetting would not happen if the network's parameters were learnt in a Bayesian fashion: instead of obtaining a single estimate of parameters θ via gradient descent, we calculate the Bayesian posterior distribution p( θ D
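
As a rough illustration of the quadratic penalty named in the title, the sketch below follows the single-previous-task form given in Kirkpatrick et al. [2017]: the new-task loss plus (λ/2) Σ_i F_i (θ_i − θ*_i)². It is not taken from this abstract. The function names, the argument names, and the assumption that a diagonal Fisher information estimate is supplied by the caller are all hypothetical choices made for the example.

```python
# Minimal sketch of an EWC-style quadratic penalty (one previous task).
# Assumptions: parameters are flat lists of floats, and `fisher_diag`
# (a diagonal Fisher information estimate) is computed elsewhere.
# All names are illustrative, not from the paper or any library.

def ewc_penalty(theta, theta_star, fisher_diag, lam):
    """Return (lam / 2) * sum_i F_i * (theta_i - theta_star_i) ** 2."""
    return 0.5 * lam * sum(
        f * (t - s) ** 2 for t, s, f in zip(theta, theta_star, fisher_diag)
    )


def ewc_penalty_grad(theta, theta_star, fisher_diag, lam):
    """Gradient of the penalty with respect to theta."""
    return [lam * f * (t - s) for t, s, f in zip(theta, theta_star, fisher_diag)]


def penalised_loss(new_task_loss, theta, theta_star, fisher_diag, lam):
    """New-task loss plus the quadratic penalty anchoring theta at theta_star."""
    return new_task_loss(theta) + ewc_penalty(theta, theta_star, fisher_diag, lam)
```

The penalty is zero at the previous solution θ* and grows fastest along directions where the Fisher entry is large, so weights that mattered for the earlier task are pulled back more strongly than weights that did not.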

artificial intelligence, machine learning, penalty

Genre:
- Research Report (0.40)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Neural Networks (1.00)
  - Statistical Learning > Gradient Descent (0.75)
