qNBO: quasi-Newton Meets Bilevel Optimization

Fang, Sheng, Liu, Yong-Jin, Yao, Wei, Yu, Chengming, Zhang, Jin

arXiv.org Artificial Intelligence 

Bilevel optimization, addressing challenges in hierarchical learning tasks, has gained significant interest in machine learning. The practical application of gradient descent methods to bilevel optimization encounters computational hurdles, notably the computation of the exact lower-level solution and the inverse Hessian of the lower-level objective. Although these two aspects are inherently connected, existing methods typically handle them separately, by solving the lower-level problem and a linear system for the inverse Hessian-vector product. In this paper, we introduce a general framework to address these computational challenges in a coordinated manner. Specifically, we leverage quasi-Newton algorithms to accelerate the solution of the lower-level problem while efficiently approximating the inverse Hessian-vector product. Furthermore, by exploiting the superlinear convergence properties of BFGS, we establish a non-asymptotic convergence analysis of the BFGS adaptation within our framework. Numerical experiments demonstrate the comparable or superior performance of the proposed algorithms in real-world learning tasks, including hyperparameter optimization, data hypercleaning, and few-shot meta-learning.

Bilevel optimization (BLO), which addresses challenges in hierarchical decision processes, has gained significant interest in many real-world applications.
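To illustrate the coupling the abstract describes, the following is a minimal sketch, not the paper's algorithm: on a toy quadratic bilevel instance, a BFGS solve of the lower-level problem also accumulates an inverse-Hessian approximation, which is then reused for the inverse Hessian-vector product in the hypergradient instead of solving a separate linear system. All problem data (`A`, `B`, `c`, `x`) and helper names are illustrative assumptions.

```python
import numpy as np

# Toy bilevel instance (A, B, c, x are illustrative stand-ins, not from the paper):
#   upper level: f(x, y) = 0.5 * ||y - c||^2
#   lower level: y*(x) = argmin_y g(x, y),  g(x, y) = 0.5 * y^T A y - (B x)^T y
# so y*(x) = A^{-1} B x and the exact hypergradient is B^T A^{-1} (y*(x) - c).
rng = np.random.default_rng(0)
n = 5
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)            # SPD lower-level Hessian
B = rng.standard_normal((n, n))
c = rng.standard_normal(n)
x = rng.standard_normal(n)

def bfgs_solve(grad, y0, iters=50, tol=1e-10):
    """Minimize a strongly convex quadratic with BFGS and exact line search,
    returning the minimizer and the accumulated inverse-Hessian approximation H.
    The point of the quasi-Newton coupling sketched here: H is a byproduct of
    the lower-level solve, so the inverse Hessian-vector product needed by the
    hypergradient requires no separate linear-system solve."""
    y = y0.copy()
    H = np.eye(len(y0))
    g = grad(y)
    for _ in range(iters):
        if np.linalg.norm(g) < tol:
            break
        p = -H @ g                       # quasi-Newton search direction
        Ap = grad(y + p) - g             # equals Hessian @ p, since grad is affine
        t = -(g @ p) / (p @ Ap)          # exact step size for a quadratic objective
        s = t * p
        y_new = y + s
        g_new = grad(y_new)
        u = g_new - g                    # gradient difference for the (s, u) pair
        su = s @ u
        if su > 1e-12:                   # curvature safeguard keeps H positive definite
            rho = 1.0 / su
            V = np.eye(len(y0)) - rho * np.outer(s, u)
            H = V @ H @ V.T + rho * np.outer(s, s)   # BFGS inverse-Hessian update
        y, g = y_new, g_new
    return y, H

# Solve the lower-level problem; grad_y g(x, y) = A y - B x for fixed x.
y_star, H = bfgs_solve(lambda y: A @ y - B @ x, np.zeros(n))

# Hypergradient via the implicit-function formula,
#   grad F(x) = grad_x f - grad_xy^2 g [grad_yy^2 g]^{-1} grad_y f,
# with grad_x f = 0 and grad_xy^2 g = -B^T here, and H standing in for A^{-1}:
v = y_star - c                           # grad_y f(x, y*)
hypergrad_qn = B.T @ (H @ v)

# Reference value computed with explicit linear solves, for comparison.
y_exact = np.linalg.solve(A, B @ x)
hypergrad_exact = B.T @ np.linalg.solve(A, y_exact - c)
```

For a quadratic lower level, BFGS with exact line search terminates in at most `n` steps and `H` recovers the true inverse Hessian, so `hypergrad_qn` matches the explicit solve; on general lower-level objectives, `H` is only an approximation, which is the regime the paper's convergence analysis addresses.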