Stability and Generalization of Bilevel Programming in Hyperparameter Optimization

Oct-9-2024, 19:19:38 GMT–Neural Information Processing Systems

The (gradient-based) bilevel programming framework is widely used in hyperparameter optimization and has achieved excellent empirical performance. Previous theoretical work mainly focuses on its optimization properties, leaving the analysis of generalization largely open. This paper attempts to address the issue by presenting an expectation bound w.r.t. the validation set based on uniform stability. Our results can explain some mysterious behaviours of bilevel programming in practice, for instance, overfitting to the validation set. We also present an expectation bound for the classical cross-validation algorithm.
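To make the setting concrete, here is a minimal sketch of gradient-based bilevel hyperparameter optimization. It is not the paper's algorithm or its bound. Everything in it is an assumption chosen for illustration: the inner problem is ridge regression (which has a closed-form solution), the single hyperparameter is the regularization strength `lam`, the hypergradient is obtained by implicit differentiation, and `lam` is parametrized as `exp(log_lam)` to keep it positive. All function names are hypothetical.

```python
import numpy as np


def inner_solution(X_tr, y_tr, lam):
    """Inner level: minimize 0.5*||X w - y||^2 + 0.5*lam*||w||^2 on the training set."""
    d = X_tr.shape[1]
    return np.linalg.solve(X_tr.T @ X_tr + lam * np.eye(d), X_tr.T @ y_tr)


def val_loss(X_val, y_val, theta):
    """Outer level objective: mean squared error (halved) on the validation set."""
    residual = X_val @ theta - y_val
    return 0.5 * np.mean(residual ** 2)


def hypergradient(X_tr, y_tr, X_val, y_val, log_lam):
    """Derivative of the validation loss w.r.t. log_lam via implicit differentiation."""
    lam = np.exp(log_lam)
    d = X_tr.shape[1]
    A = X_tr.T @ X_tr + lam * np.eye(d)
    theta = np.linalg.solve(A, X_tr.T @ y_tr)
    # Differentiating the stationarity condition A(lam) theta = X^T y gives
    # d theta / d lam = -A^{-1} theta.
    dtheta_dlam = -np.linalg.solve(A, theta)
    grad_theta = X_val.T @ (X_val @ theta - y_val) / len(y_val)
    # Chain rule through lam = exp(log_lam) contributes the factor lam.
    return float(grad_theta @ dtheta_dlam) * lam


def tune(X_tr, y_tr, X_val, y_val, log_lam=0.0, lr=0.5, steps=100):
    """Outer loop: plain gradient descent on log_lam (illustrative step size)."""
    for _ in range(steps):
        log_lam -= lr * hypergradient(X_tr, y_tr, X_val, y_val, log_lam)
    return log_lam


if __name__ == "__main__":
    # Synthetic data, assumed only for this demonstration.
    rng = np.random.default_rng(0)
    w_true = rng.normal(size=5)
    X_tr = rng.normal(size=(30, 5))
    y_tr = X_tr @ w_true + rng.normal(size=30)
    X_val = rng.normal(size=(20, 5))
    y_val = X_val @ w_true + rng.normal(size=20)

    log_lam = tune(X_tr, y_tr, X_val, y_val)
    theta = inner_solution(X_tr, y_tr, np.exp(log_lam))
    print("tuned lam:", np.exp(log_lam))
    print("validation loss:", val_loss(X_val, y_val, theta))
```

Because the outer loop selects `lam` using the validation set alone, the validation loss it reports is an optimistic estimate when that set is small. This is one simple way to picture the validation overfitting that the abstract mentions.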

bilevel programming, hyperparameter optimization, stability and generalization

Genre:
- Research Report > New Finding (0.91)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning (1.00)