Hessian-based Analysis of Large Batch Training and Robustness to Adversaries
Zhewei Yao, Amir Gholami, Qi Lei, Kurt Keutzer, Michael W. Mahoney
Neural Information Processing Systems
Extensive experiments on multiple networks show that saddle points are not the cause of the generalization gap in large-batch training, and the results consistently show that large-batch training converges to points with a noticeably higher Hessian spectrum.
Nov-20-2025, 14:26:42 GMT
- Country:
- Asia > Middle East
- Jordan (0.04)
- Europe > United Kingdom
- England > Cambridgeshire > Cambridge (0.04)
- North America
- Canada > Quebec
- Montreal (0.04)
- United States
- California (0.04)
- Texas > Travis County
- Austin (0.04)
- Genre:
- Research Report > New Finding (0.69)
- Industry:
- Information Technology > Security & Privacy (0.30)
- Technology: