On Convergence and Generalization of Dropout Training
Neural Information Processing Systems
We study dropout in two-layer neural networks with rectified linear unit (ReLU) activations. Under mild overparametrization, and assuming that the limiting kernel can separate the data distribution with a positive margin, we show that dropout training with logistic loss achieves $\epsilon$-suboptimality in the test error within $O(1/\epsilon)$ iterations.
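To make the setting concrete, here is a minimal sketch, not the paper's exact algorithm: a two-layer ReLU network trained with dropout on the hidden layer and logistic loss. The width, dropout rate, learning rate, and synthetic data are illustrative assumptions, not values from the paper.

```python
# Minimal sketch of dropout training in a two-layer ReLU network with
# logistic loss. All hyperparameters and data below are assumptions
# chosen for illustration, not the paper's configuration.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic binary classification data (assumed): labels separable
# with a positive margin along the first coordinate.
n, d, width = 512, 10, 1024  # width >> d plays the role of overparametrization
X = torch.randn(n, d)
y = (X[:, 0] > 0).float()

model = nn.Sequential(
    nn.Linear(d, width),
    nn.ReLU(),
    nn.Dropout(p=0.5),   # dropout applied to the hidden-layer activations
    nn.Linear(width, 1),
)

loss_fn = nn.BCEWithLogitsLoss()  # logistic loss on the network output
opt = torch.optim.SGD(model.parameters(), lr=0.1)

for step in range(200):
    opt.zero_grad()
    logits = model(X).squeeze(1)
    loss = loss_fn(logits, y)
    loss.backward()
    opt.step()

model.eval()  # at test time, dropout is replaced by its expectation
with torch.no_grad():
    err = ((model(X).squeeze(1) > 0).float() != y).float().mean()
print(f"final training loss {loss.item():.4f}, 0-1 error {err.item():.4f}")
```

Calling `model.eval()` before evaluation matters here: `nn.Dropout` rescales activations during training so that the deterministic test-time network matches the expected dropout network.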