Learning Feature Hierarchies with Centered Deep Boltzmann Machines

Montavon, Grégoire, Müller, Klaus-Robert

Mar-16-2012–arXiv.org Artificial Intelligence

Deep Boltzmann machines are in principle powerful models for extracting the hierarchical structure of data. Unfortunately, attempts to train layers jointly (without greedy layer-wise pretraining) have been largely unsuccessful. We propose a modification of the learning algorithm that initially recenters the output of the activation functions to zero. This modification leads to a better conditioned Hessian and thus makes learning easier. We test the algorithm on real data and demonstrate that our suggestion, the centered deep Boltzmann machine, learns a hierarchy of increasingly abstract representations and a better generative model of data.

artificial intelligence, boltzmann machine, machine learning, (18 more...)

arXiv.org Artificial Intelligence

Mar-16-2012

arXiv.org PDF

Add feedback

Country:
- North America > Canada > Ontario > Toronto (0.14)

Genre:
- Research Report (0.40)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Neural Networks > Deep Learning (1.00)
  - Learning Graphical Models > Undirected Networks
    - Markov Models (0.88)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found