Strong convexity-guided hyper-parameter optimization for flatter losses
Yedida, Rahul; Saha, Snehanshu
arXiv.org Artificial Intelligence
We propose a novel white-box approach to hyper-parameter optimization. Motivated by recent work establishing a relationship between flat minima and generalization, we first establish a relationship between the strong convexity of the loss and its flatness. Based on this, we seek to find hyper-parameter configurations that improve flatness by minimizing the strong convexity of the loss. By using the structure of the underlying neural network, we derive closed-form equations to approximate the strong convexity parameter, and attempt to find hyper-parameters that minimize it in a randomized fashion. Through experiments on 14 classification datasets, we show that our method achieves strong performance at a fraction of the runtime.
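To make the central quantity concrete, here is a minimal sketch of a randomized search that minimizes a strong convexity parameter. It does not reproduce the paper's closed-form equations for neural networks. As a stand-in, it assumes a two-feature ridge regression loss, for which the strong convexity parameter is exactly the smallest eigenvalue of the Hessian. The function names, the toy data, and the log-uniform sampling range are all illustrative assumptions.

```python
import math
import random


def min_eig_2x2(a, b, c):
    """Smallest eigenvalue of the symmetric matrix [[a, b], [b, c]]."""
    return (a + c) / 2 - math.sqrt(((a - c) / 2) ** 2 + b ** 2)


def strong_convexity(X, lam):
    """Strong convexity parameter mu of the toy loss
    L(w) = (1/2n) * ||Xw - y||^2 + (lam/2) * ||w||^2 with two features.
    The Hessian is X^T X / n + lam * I, and mu is its smallest eigenvalue.
    (Assumed stand-in; not the paper's neural-network approximation.)"""
    n = len(X)
    a = sum(x0 * x0 for x0, _ in X) / n
    b = sum(x0 * x1 for x0, x1 in X) / n
    c = sum(x1 * x1 for _, x1 in X) / n
    return min_eig_2x2(a + lam, b, c + lam)


def random_search(X, n_trials=20, seed=0):
    """Sample lam log-uniformly in [1e-4, 1] and keep the value with the
    smallest mu. The range and trial count are arbitrary choices."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        lam = 10 ** rng.uniform(-4, 0)
        mu = strong_convexity(X, lam)
        if best is None or mu < best[1]:
            best = (lam, mu)
    return best


# Hypothetical toy data: four samples, two features each.
X = [(1.0, 0.5), (0.2, 1.5), (-1.0, 0.3), (0.7, -0.8)]
best_lam, best_mu = random_search(X)
```

In this toy setting mu grows linearly with `lam`, so the search simply favors the weakest regularization it samples. The sketch only illustrates the "score each configuration by its strong convexity, keep the minimum" loop described in the abstract.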
Feb-7-2024