Greedy Layer-Wise Training of Deep Networks

Yoshua Bengio, Pascal Lamblin, Dan Popovici, Hugo Larochelle

Neural Information Processing Systems 

Complexity theory of circuits strongly suggests that deep architectures can be much more efficient (sometimes exponentially so) than shallow architectures in terms of the computational elements required to represent some functions. Deep multi-layer neural networks have many levels of non-linearities, allowing them to compactly represent highly non-linear and highly varying functions. However, until recently it was not clear how to train such deep networks, since gradient-based optimization starting from random initialization often appears to get stuck in poor solutions.
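
The greedy layer-wise strategy named in the title trains one layer at a time on the representation produced by the layers below it, so each optimization problem stays shallow. What follows is a minimal sketch of that idea using tied-weight sigmoid autoencoders trained by plain SGD on squared reconstruction error; the layer sizes, learning rate, epoch count, and helper names are illustrative assumptions, not values from the paper, which studies RBM- and autoencoder-based variants (followed by supervised fine-tuning) in detail.

import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def pretrain_layer(X, n_hidden, lr=0.1, epochs=50):
    """Train one tied-weight autoencoder on X; return its encoder (W, b).

    Hyperparameters here are illustrative assumptions, not the paper's.
    """
    n_visible = X.shape[1]
    W = rng.normal(0.0, 0.1, (n_visible, n_hidden))
    b = np.zeros(n_hidden)   # hidden (encoder) bias
    c = np.zeros(n_visible)  # reconstruction (decoder) bias
    for _ in range(epochs):
        for x in X:
            h = sigmoid(x @ W + b)        # encode
            x_hat = sigmoid(h @ W.T + c)  # decode with tied weights
            # Backprop of 0.5 * ||x_hat - x||^2 through the sigmoids
            d_out = (x_hat - x) * x_hat * (1.0 - x_hat)
            d_hid = (d_out @ W) * h * (1.0 - h)
            # Tied weights get gradient from both encoder and decoder paths
            W -= lr * (np.outer(x, d_hid) + np.outer(d_out, h))
            b -= lr * d_hid
            c -= lr * d_out
    return W, b

def greedy_pretrain(X, layer_sizes):
    """Stack layers greedily: each autoencoder trains on the codes below it."""
    params, H = [], X
    for n_hidden in layer_sizes:
        W, b = pretrain_layer(H, n_hidden)
        params.append((W, b))
        H = sigmoid(H @ W + b)  # frozen encoding becomes the next layer's input
    return params

X = rng.random((100, 16))            # toy data in [0, 1]
stack = greedy_pretrain(X, [12, 8, 4])
# After pretraining, the whole stack would normally be fine-tuned
# with gradient descent on the supervised task.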
