[D] Output variance of a deep CNN vanishes during training. • r/MachineLearning


I am working with a 20-layer CNN whose output is a softmax over 3 classes. When I use a depth of 32 for all conv layers, I observe smooth convergence to the expected output. However, when I change only the depth of all conv layers to 64, I observe the following: after initialization, the outputs show a reasonable amount of variance across different inputs. During training, this variance gradually vanishes until it seems that only the bias is learned. Apparently, the gradient w.r.t. the conv weights vanishes over time.
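To make the symptom measurable, here is a rough, framework-agnostic sketch of the two quantities described above: the variance of the softmax outputs across inputs, and the norm of a gradient. It is not the code from my training setup. The helper names (`output_variance`, `l2_norm`), the list-of-lists layout, and the logits are all made up for illustration; in practice you would log the same numbers from your framework's tensors every few steps.

```python
import math
from statistics import pvariance


def softmax(logits):
    """Numerically stable softmax over one list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def output_variance(batch_probs):
    """Variance across inputs, averaged over classes.

    batch_probs: one probability vector per input (hypothetical layout).
    A value near 0 means every input receives almost the same prediction,
    which is what a network that has only learned a bias would produce.
    """
    per_class = zip(*batch_probs)  # transpose: one column per class
    variances = [pvariance(col) for col in per_class]
    return sum(variances) / len(variances)


def l2_norm(values):
    """L2 norm of a flat list of gradient entries."""
    return math.sqrt(sum(v * v for v in values))


# Made-up logits for illustration only: 3 inputs, 3 classes.
varied = [softmax(l) for l in ([2.0, 0.0, -1.0], [-1.0, 2.0, 0.0], [0.0, -1.0, 2.0])]
collapsed = [softmax(l) for l in ([0.5, 0.1, -0.2], [0.5, 0.1, -0.2], [0.5, 0.1, -0.2])]

print("varied outputs:   ", output_variance(varied))
print("collapsed outputs:", output_variance(collapsed))
```

Logging `output_variance` on a fixed validation batch together with `l2_norm` of each conv layer's weight gradient should show whether the outputs collapse before, after, or at the same time as the gradients shrink.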
