On Convergence and Generalization of Dropout Training

Poorya Mianjy, Raman Arora

arXiv.org Machine Learning 

We study dropout in two-layer neural networks with rectified linear unit (ReLU) activations. Under mild overparametrization and assuming that the limiting kernel can separate the data distribution with a positive margin, we show that dropout training with logistic loss achieves $\epsilon$-suboptimality in test error in $O(1/\epsilon)$ iterations.
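To make the setting concrete, below is a minimal sketch (not the paper's exact algorithm or analysis) of dropout training on a width-$m$ two-layer ReLU network with the logistic loss. The toy data, width, keep probability, and step size are illustrative assumptions; the fixed random top layer follows common NTK-style setups.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linearly separable data with labels in {-1, +1} (illustrative assumption).
n, d, m = 200, 10, 512            # samples, input dim, hidden width (overparametrized)
w_star = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = np.sign(X @ w_star)

# Two-layer ReLU network f(x) = a^T ReLU(W x); the top layer a is fixed
# to random signs and only W is trained (a common simplification).
W = rng.normal(size=(m, d)) / np.sqrt(d)
a = rng.choice([-1.0, 1.0], size=m) / np.sqrt(m)

p_keep = 0.5      # dropout keep probability
lr = 0.1          # step size
T = 500           # number of SGD iterations

for t in range(T):
    i = rng.integers(n)                               # sample one example
    x, yi = X[i], y[i]
    h = np.maximum(W @ x, 0.0)                        # hidden ReLU activations
    mask = rng.binomial(1, p_keep, size=m) / p_keep   # inverted-dropout mask
    out = a @ (mask * h)                              # dropped-out network output
    # Logistic loss log(1 + exp(-y * out)); derivative w.r.t. the output:
    g = -yi / (1.0 + np.exp(yi * out))
    # Gradient w.r.t. W through the dropped-out hidden layer.
    grad_W = np.outer(g * a * mask * (W @ x > 0), x)
    W -= lr * grad_W

# 0-1 error of the deterministic network (dropout turned off) on the training set.
pred = np.sign(np.maximum(X @ W.T, 0.0) @ a)
print("train 0-1 error:", np.mean(pred != y))
```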
