The idea is ridiculously simple (perhaps why it is effective?): randomly skip layers while training • /r/MachineLearning
I don't understand the claim "Remember all the narratives we told about how depth learns hierarchical representations, and higher level representations -- those higher level representations don't seem to matter so much after all." The net has over 100 layers!? I imagine this would also work reasonably well in the RNN encoder of an encoder/decoder framework. I wonder whether it also applies to generative RNNs.
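For anyone who wants to see the idea from the title concretely, here is a minimal sketch of randomly skipping layers during training. It's not the linked work's actual implementation: the names `residual_block`, `forward`, and `survival_prob` are made up for illustration, the blocks are toy linear+ReLU residual branches, and scaling each branch by its survival probability at test time is an assumption borrowed from how dropout-style methods are usually evaluated.

```python
import numpy as np

def residual_block(x, weight):
    # Hypothetical residual branch: one linear map followed by a ReLU.
    return np.maximum(x @ weight, 0.0)

def forward(x, weights, survival_prob=0.8, training=True, rng=None):
    # Assumption: every block has the same survival probability.
    if rng is None:
        rng = np.random.default_rng()
    for w in weights:
        if training:
            # Keep the block with probability survival_prob; otherwise skip
            # it entirely, so only the identity shortcut carries x forward.
            if rng.random() < survival_prob:
                x = x + residual_block(x, w)
        else:
            # At test time every block runs, with its branch scaled by the
            # probability that it was active during training.
            x = x + survival_prob * residual_block(x, w)
    return x

x = np.ones((2, 4))
weights = [np.eye(4) * 0.1 for _ in range(5)]
train_out = forward(x, weights, training=True, rng=np.random.default_rng(0))
test_out = forward(x, weights, training=False)
```

Because a skipped block reduces to the identity, each training pass effectively runs a shallower network, while the full depth is still used at test time.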
Apr-3-2016, 16:15:28 GMT