Reviews: Small ReLU networks are powerful memorizers: a tight analysis of memorization capacity
–Neural Information Processing Systems
The paper investigates the problem of expressiveness in neural networks with respect to memorization capacity. The authors also show an upper bound for classification, a corollary of which is that a network with three hidden layers of sizes 2k-2k-4k can perfectly classify ImageNet. Moreover, they show that if the total number of hidden nodes in a ResNet is of order N/d_x, where d_x is the input dimension, then the network can again perfectly realize the data. Lastly, an analysis is given showing that batch SGD initialized close to a global minimum will reach a point whose loss is significantly smaller than the loss at initialization (though a convergence guarantee could not be given). The paper is clear and easy to follow for the most part, and conveys the feeling that the authors did their best to make the analysis as thorough and exhaustive as possible, providing results for various settings.
Jan-27-2025, 12:50:16 GMT