Lookaround Optimizer: k steps around, 1 step average

Dec-25-2025, 11:45:24 GMT–Neural Information Processing Systems

Weight Average (WA) is an active research topic because it offers a simple way to ensemble deep networks and is effective at promoting generalization. Existing weight-averaging approaches, however, are usually applied post hoc along a single training trajectory (i.e., the weights are averaged only after the entire training process has finished), which significantly reduces the diversity among the averaged networks and thus limits their effectiveness. In this paper, inspired by weight averaging, we propose Lookaround, a straightforward yet effective SGD-based optimizer that leads to flatter minima with better generalization.


