Strong convexity-guided hyper-parameter optimization for flatter losses

Feb-7-2024, arXiv.org Artificial Intelligence

We propose a novel white-box approach to hyper-parameter optimization. Motivated by recent work linking flat minima to generalization, we first establish a relationship between the strong convexity of the loss and its flatness. Building on this relationship, we seek hyper-parameter configurations that improve flatness by minimizing the strong convexity of the loss. Using the structure of the underlying neural network, we derive closed-form equations that approximate the strong convexity parameter, and we use a randomized search to find hyper-parameters that minimize it. In experiments on 14 classification datasets, our method achieves strong performance at a fraction of the runtime.
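As a rough illustration of the idea, the sketch below runs a randomized search for the hyper-parameter value with the smallest strong convexity parameter. It is a toy stand-in, not the method from the paper: the abstract does not give the closed-form neural-network approximation, so the example assumes an L2-regularized least-squares loss, whose Hessian is constant and whose strong convexity parameter is exactly its smallest eigenvalue. The function names, the regularization strength `lam`, and the log-uniform search range are all hypothetical.

```python
import numpy as np


def strong_convexity_ridge(X, lam):
    """Strong convexity parameter of the assumed toy loss
    f(w) = ||Xw - y||^2 / (2n) + (lam / 2) * ||w||^2.

    The Hessian is X^T X / n + lam * I for every w, so the parameter
    is the smallest eigenvalue of that matrix.
    """
    n, d = X.shape
    hessian = X.T @ X / n + lam * np.eye(d)
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    return float(np.linalg.eigvalsh(hessian)[0])


def random_search_min_mu(X, n_trials=20, seed=0):
    """Sample lam log-uniformly in [1e-4, 1] (an assumed range) and keep
    the sample whose loss has the smallest strong convexity parameter."""
    rng = np.random.default_rng(seed)
    best_lam, best_mu = None, np.inf
    for _ in range(n_trials):
        lam = 10.0 ** rng.uniform(-4.0, 0.0)
        mu = strong_convexity_ridge(X, lam)
        if mu < best_mu:
            best_lam, best_mu = lam, mu
    return best_lam, best_mu


if __name__ == "__main__":
    X = np.random.default_rng(1).normal(size=(50, 5))
    lam, mu = random_search_min_mu(X)
    print(f"selected lam={lam:.5f}, strong convexity parameter={mu:.5f}")
```

In this toy setting the parameter grows monotonically with `lam`, so the search simply favors the smallest sampled value. The example is only meant to show the overall shape of the procedure: sample configurations, score each one with a strong convexity estimate, and keep the minimizer.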

minima, optimization, strong convexity

Country:
- North America
  - United States
    - West Virginia (0.04)
    - North Carolina (0.04)
    - Massachusetts > Middlesex County
      - Cambridge (0.04)
    - California > Santa Clara County
      - Palo Alto (0.04)
  - Canada > Alberta
    - Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)
- Europe > United Kingdom
  - England > Cambridgeshire > Cambridge (0.04)
- Asia > India
  - Karnataka > Bengaluru (0.04)
- Africa > Middle East
  - Tunisia > Ben Arous Governorate > Ben Arous (0.04)

Genre:
- Research Report > New Finding (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning
    - Optimization (1.00)
    - Uncertainty > Bayesian Inference (0.46)
  - Machine Learning
    - Statistical Learning (1.00)
    - Neural Networks (1.00)
    - Learning Graphical Models > Directed Networks
      - Bayesian Learning (0.68)