50 things I learned at NIPS 2016

Why does deep learning work now, but not 20 years ago, even though many of the core ideas were there? In one sentence: We have more data, more compute, better software engineering, and a few algorithmic innovations (many layers, ReLUs, better initialization and learning rates, dropout, LSTMs). But why does gradient-based optimization work at all in neural nets despite the non-convexity? One possible, partial answer is overprovisioning: There are generally many hidden units, and there are many ways a neural net can approximately implement the desired input-output relationship. You only need to find one.
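The overprovisioning point can be illustrated with a toy experiment. This is a minimal sketch under assumptions that are not from the talk: the task (fitting sin(x)), the network size, the learning rate, and the helper name `train_mlp` are all made up for illustration. It trains the same small, over-wide network from two random starting points with plain gradient descent. If there are many acceptable solutions, both runs should reduce the loss while ending up at different weights.

```python
import numpy as np


def train_mlp(seed, hidden=32, steps=3000, lr=0.005):
    """Fit sin(x) with a one-hidden-layer tanh network using full-batch
    gradient descent. Returns (initial_loss, final_loss, W1).

    All hyperparameters are illustrative assumptions, not tuned values.
    """
    rng = np.random.default_rng(seed)
    x = np.linspace(-np.pi, np.pi, 64).reshape(-1, 1)  # inputs, shape (n, 1)
    y = np.sin(x)                                      # targets, shape (n, 1)
    n = x.shape[0]

    # Far more hidden units than this simple target strictly needs.
    W1 = rng.normal(0.0, 1.0, size=(1, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 1.0 / np.sqrt(hidden), size=(hidden, 1))
    b2 = np.zeros(1)

    def forward():
        h = np.tanh(x @ W1 + b1)   # hidden activations, shape (n, hidden)
        pred = h @ W2 + b2         # predictions, shape (n, 1)
        return h, pred

    _, pred = forward()
    initial_loss = float(np.mean((pred - y) ** 2))

    for _ in range(steps):
        h, pred = forward()
        err = pred - y

        # Backpropagation for the mean squared error loss.
        g_pred = 2.0 * err / n
        g_W2 = h.T @ g_pred
        g_b2 = g_pred.sum(axis=0)
        g_h = g_pred @ W2.T
        g_z = g_h * (1.0 - h ** 2)  # derivative of tanh
        g_W1 = x.T @ g_z
        g_b1 = g_z.sum(axis=0)

        # Plain gradient descent step.
        W1 -= lr * g_W1
        b1 -= lr * g_b1
        W2 -= lr * g_W2
        b2 -= lr * g_b2

    _, pred = forward()
    final_loss = float(np.mean((pred - y) ** 2))
    return initial_loss, final_loss, W1


if __name__ == "__main__":
    for seed in (0, 1):
        init, final, W1 = train_mlp(seed)
        print(f"seed={seed}  initial loss={init:.4f}  final loss={final:.4f}")
        print("  first three input weights:", np.round(W1[0, :3], 3))
```

Comparing the printed weights across seeds shows that the two runs do not converge to the same parameters, even though both fit the target better than they did at initialization. This toy case only hints at the argument. It does not show that the same holds for large networks on real data.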
