Reviews: The Effect of Network Width on the Performance of Large-batch Training

Neural Information Processing Systems 

It has been widely discussed how to develop algorithms that allow large batches, so that neural networks can be trained efficiently in a distributed environment. The paper investigates the effect of network width on the performance of large-batch training, both theoretically and experimentally. The authors claim that, for a fixed number of parameters, neural networks with a wider architecture can more easily be trained with appropriately large batches. Theoretical support is given for 2-layer linear/nonlinear networks and multilayer linear networks. The paper is well-written and easy to follow.
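The "same number of parameters" comparison can be sketched concretely; the dimensions below are illustrative assumptions, not taken from the paper:

```python
def linear_param_count(layer_widths):
    """Total weight count (biases omitted) of a linear network whose
    layer widths are given as [input, hidden..., output]."""
    return sum(a * b for a, b in zip(layer_widths, layer_widths[1:]))

d_in, d_out = 100, 10
wide = [d_in, 200, d_out]          # 2-layer network with a wide hidden layer
deep = [d_in, 80, 80, 80, d_out]   # multilayer network with narrower layers

# Both architectures have a comparable parameter budget,
# which is the setting of the paper's wide-vs-deep comparison.
print(linear_param_count(wide))  # → 22000
print(linear_param_count(deep))  # → 21600
```

Under such a matched budget, the paper's claim is that the wider, shallower architecture tolerates larger training batches.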