R-Drop: Regularized Dropout for Neural Networks

Neural Information Processing Systems 

In this paper, we introduce a simple yet more effective alternative to regularize the training inconsistency induced by dropout, named as R-Drop. Concretely, in each mini-batch training, each data sample goes through the forward pass twice, and each pass is processed by a different sub model by randomly dropping out some hidden units.
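
The two-pass procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the tiny one-hidden-layer network, the inverted-dropout rescaling, and the names dropout, forward, and two_pass_forward are all assumptions made for illustration, and the sketch stops at producing the two outputs because this excerpt does not say how they are combined during training.

```python
import random


def dropout(hidden, p, rng):
    """Zero each hidden unit with probability p and rescale the survivors.

    Assumes 0 <= p < 1 (inverted dropout, an illustrative choice).
    """
    keep = 1.0 - p
    return [h / keep if rng.random() < keep else 0.0 for h in hidden]


def forward(x, w_hidden, w_out, p, rng):
    """One forward pass through a hypothetical one-hidden-layer ReLU network."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    # A fresh random mask is drawn here, so each call uses a different sub model.
    hidden = dropout(hidden, p, rng)
    return [sum(w * h for w, h in zip(row, hidden)) for row in w_out]


def two_pass_forward(batch, w_hidden, w_out, p=0.5, seed=0):
    """Send every sample in the mini-batch through the network twice."""
    rng = random.Random(seed)
    pairs = []
    for x in batch:
        out_first = forward(x, w_hidden, w_out, p, rng)
        out_second = forward(x, w_hidden, w_out, p, rng)
        pairs.append((out_first, out_second))
    return pairs


# Hypothetical weights: 3 inputs, 4 hidden units, 2 outputs.
W_HIDDEN = [[0.2, -0.1, 0.4], [0.5, 0.3, -0.2], [-0.3, 0.8, 0.1], [0.6, 0.1, 0.7]]
W_OUT = [[0.3, -0.5, 0.2, 0.1], [-0.4, 0.6, 0.7, -0.2]]
BATCH = [[1.0, 2.0, 3.0], [0.5, -1.0, 2.0]]

pairs = two_pass_forward(BATCH, W_HIDDEN, W_OUT, p=0.5, seed=0)
```

With p=0.0 no units are dropped, so both passes for a sample return the same output. With p greater than 0 the two passes generally disagree, which is the kind of dropout-induced inconsistency the abstract refers to.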
