Hessian-based Analysis of Large Batch Training and Robustness to Adversaries

Zhewei Yao, Amir Gholami, Qi Lei, Kurt Keutzer, Michael W. Mahoney

Nov-20-2025, 14:26:42 GMT–Neural Information Processing Systems

Extensive experiments on multiple networks show that saddle points are not the cause of the generalization gap in large-batch training; the results consistently show that large-batch training converges to points with a noticeably larger Hessian spectrum.
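
To make "larger Hessian spectrum" concrete, the sketch below estimates the dominant Hessian eigenvalue of a loss at a given point using power iteration on Hessian-vector products. It is only an illustration under stated assumptions: the function names, the finite-difference approximation of the Hessian-vector product, and the toy quadratic loss are all hypothetical choices made here, not details taken from the paper.

```python
import numpy as np

def hessian_vector_product(grad_fn, w, v, eps=1e-5):
    """Approximate H(w) @ v with a central difference of gradients.

    Assumption: grad_fn(w) returns the gradient of the loss at w.
    """
    return (grad_fn(w + eps * v) - grad_fn(w - eps * v)) / (2.0 * eps)

def top_hessian_eigenvalue(grad_fn, w, iters=200, seed=0):
    """Power iteration estimate of the largest-magnitude Hessian eigenvalue at w."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(w.shape)
    v /= np.linalg.norm(v)
    eig = 0.0
    for _ in range(iters):
        hv = hessian_vector_product(grad_fn, w, v)
        eig = float(v @ hv)          # Rayleigh quotient for the unit vector v
        norm = np.linalg.norm(hv)
        if norm == 0.0:              # v lies in the null space; stop early
            break
        v = hv / norm
    return eig

# Hypothetical toy loss 0.5 * w^T A w, whose Hessian is exactly A.
A = np.diag([4.0, 1.0, 0.5])
grad_fn = lambda w: A @ w
w0 = np.ones(3)
lam = top_hessian_eigenvalue(grad_fn, w0)
print(f"estimated top eigenvalue: {lam:.4f}")
```

In this reading, a point with a larger dominant eigenvalue sits in a sharper region of the loss surface. For a real network, the finite-difference product would typically be replaced by an automatic-differentiation Hessian-vector product.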

artificial intelligence, machine learning, optimization problem

Country:
- North America
  - United States
    - California (0.04)
    - Texas > Travis County
      - Austin (0.04)
  - Canada > Quebec
    - Montreal (0.04)
- Europe > United Kingdom
  - England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East
  - Jordan (0.04)

Genre:
- Research Report > New Finding (0.69)

Industry:
- Information Technology > Security & Privacy (0.30)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Optimization (0.68)
  - Machine Learning
    - Statistical Learning (0.93)
    - Neural Networks > Deep Learning (0.47)
