Accelerating Quadratic Optimization Up to 3x With Reinforcement Learning
First-order methods for solving quadratic programs (QPs) are widely used where many problems must be solved quickly, for example in embedded optimal control and large-scale machine learning. The drawback is that these methods typically need thousands of iterations, which makes them unsuitable for real-time control applications with tight latency constraints. To address this, a research team from the University of California, Berkeley, Princeton University and ETH Zurich has proposed RLQP, an accelerated QP solver that uses deep reinforcement learning (RL) to compute a policy that adapts the internal parameters of a first-order QP solver and so speeds up its convergence. The team built RLQP on the OSQP (Operator Splitting QP) solver, which solves QPs with the alternating direction method of multipliers (ADMM), an efficient first-order optimization algorithm. RLQP learns a policy that adjusts the internal parameters of the ADMM algorithm between iterations in order to minimize solve times.
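To make the idea of adapting a solver parameter between ADMM iterations concrete, here is a minimal Python sketch. It is not the RLQP or OSQP implementation: `admm_qp` is a simplified ADMM loop written for illustration, and `residual_balancing_policy` is a hand-written heuristic standing in for the learned RL policy. The function names, the default values and the toy problem are all assumptions made for this example.

```python
import numpy as np

def residual_balancing_policy(rho, r_prim, r_dual):
    """Hypothetical stand-in for a learned policy (NOT the RLQP network).

    Raises the step-size parameter rho when the primal residual dominates
    and lowers it when the dual residual dominates.
    """
    scale = np.sqrt((r_prim + 1e-12) / (r_dual + 1e-12))
    return float(np.clip(rho * scale, 1e-6, 1e6))

def admm_qp(P, q, A, l, u, policy=residual_balancing_policy,
            rho=0.1, sigma=1e-6, adapt_every=10, max_iter=2000, tol=1e-6):
    """Simplified ADMM for: minimize 0.5 x'Px + q'x  subject to  l <= Ax <= u.

    Returns the approximate solution and the number of iterations used.
    """
    n, m = P.shape[0], A.shape[0]
    x, z, y = np.zeros(n), np.zeros(m), np.zeros(m)
    for k in range(1, max_iter + 1):
        # x-update: solve the regularized linear system for the current rho.
        K = P + sigma * np.eye(n) + rho * (A.T @ A)
        x = np.linalg.solve(K, sigma * x - q + A.T @ (rho * z - y))
        Ax = A @ x
        # z-update: project onto the box [l, u].
        z = np.clip(Ax + y / rho, l, u)
        # Dual update.
        y = y + rho * (Ax - z)
        # Primal and dual residuals (infinity norm).
        r_prim = np.linalg.norm(Ax - z, np.inf)
        r_dual = np.linalg.norm(P @ x + q + A.T @ y, np.inf)
        if r_prim < tol and r_dual < tol:
            return x, k
        # Between iterations, let the policy pick a new rho.
        if k % adapt_every == 0:
            rho = policy(rho, r_prim, r_dual)
    return x, max_iter

# Toy problem: minimize 0.5*||x||^2 - x1 - x2  subject to  0 <= x <= 0.5.
P, A = np.eye(2), np.eye(2)
q = np.array([-1.0, -1.0])
l, u = np.zeros(2), np.full(2, 0.5)

x_adaptive, iters_adaptive = admm_qp(P, q, A, l, u)
# Same solver with a policy that never changes rho, for comparison.
x_fixed, iters_fixed = admm_qp(P, q, A, l, u, policy=lambda rho, rp, rd: rho)
```

In this sketch the policy sees only the two residual norms. Swapping `policy` for a trained model is where an approach like RLQP would plug in, and comparing `iters_adaptive` with `iters_fixed` shows how much the choice of rho can affect the iteration count on this toy problem.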
Jul-30-2021, 14:46:34 GMT