Accelerating Quadratic Optimization with Reinforcement Learning
Neural Information Processing Systems
First-order methods for quadratic optimization, such as OSQP, are widely used in large-scale machine learning and embedded optimal control, where many related problems must be solved rapidly. These methods face two persistent challenges: manual hyperparameter tuning and slow convergence to high-accuracy solutions. To address both, we explore how Reinforcement Learning (RL) can learn a policy that tunes solver parameters to accelerate convergence. In experiments with well-known QP benchmarks, we find that our RL policy, RLQP, significantly outperforms state-of-the-art QP solvers, by up to 3x.
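To make the idea of a parameter-tuning policy concrete, here is a toy sketch. It is not OSQP or RLQP: it is a simplified ADMM loop for a box-constrained QP with a diagonal quadratic term, chosen so that it needs no linear-algebra library. The names `solve_box_qp`, `fixed_rho`, and `balance_residuals` are hypothetical, and `balance_residuals` is a hand-written heuristic standing in at the point where a learned policy would plug in.

```python
import math


def solve_box_qp(p, q, lo, hi, rho_policy, rho=0.1, tol=1e-8, max_iter=10_000):
    """Toy ADMM for: minimize sum(0.5*p[i]*x[i]**2 + q[i]*x[i]), lo[i] <= x[i] <= hi[i].

    rho_policy(rho, r_prim, r_dual) returns the step size for the next iteration.
    Returns (solution, iterations_used).
    """
    n = len(p)
    z = [0.0] * n  # copy of x that is kept inside the box
    y = [0.0] * n  # unscaled dual variable for the constraint x == z
    for k in range(1, max_iter + 1):
        # x-update: closed form because the quadratic term is diagonal.
        x = [(rho * z[i] - y[i] - q[i]) / (p[i] + rho) for i in range(n)]
        # z-update: project onto the box.
        z_new = [min(max(x[i] + y[i] / rho, lo[i]), hi[i]) for i in range(n)]
        # Dual update.
        y = [y[i] + rho * (x[i] - z_new[i]) for i in range(n)]
        # Primal and dual residuals (infinity norm).
        r_prim = max(abs(x[i] - z_new[i]) for i in range(n))
        r_dual = rho * max(abs(z_new[i] - z[i]) for i in range(n))
        z = z_new
        if r_prim < tol and r_dual < tol:
            return z, k
        # The policy chooses the step size for the next iteration.
        rho = rho_policy(rho, r_prim, r_dual)
    return z, max_iter


def fixed_rho(rho, r_prim, r_dual):
    """Baseline: never change the step size."""
    return rho


def balance_residuals(rho, r_prim, r_dual):
    """Hand-written stand-in for a learned policy (an assumption, not RLQP).

    Scales rho by sqrt(r_prim / r_dual), limited to a factor of 10 per step.
    """
    factor = math.sqrt(r_prim / max(r_dual, 1e-12))
    factor = min(max(factor, 0.1), 10.0)
    return min(max(rho * factor, 1e-6), 1e6)


# Small problem whose solution sits on the upper bound of both variables.
p, q = [1.0, 4.0], [-3.0, -10.0]
lo, hi = [0.0, 0.0], [1.0, 1.0]
x_fixed, iters_fixed = solve_box_qp(p, q, lo, hi, fixed_rho)
x_adapt, iters_adapt = solve_box_qp(p, q, lo, hi, balance_residuals)
print(x_fixed, iters_fixed)
print(x_adapt, iters_adapt)
```

On this particular problem both bounds are active, so a larger step size helps and the adaptive rule needs far fewer iterations than the fixed one. That ordering is specific to the example: the best step size depends on the problem, which is the kind of decision a learned policy is meant to make.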
Feb-10-2025, 06:30:52 GMT