Neural Sum-of-Squares: Certifying the Nonnegativity of Polynomials with Transformers

Pelleriti, Nico, Spiegel, Christoph, Liu, Shiwei, Martínez-Rubio, David, Zimmer, Max, Pokutta, Sebastian

arXiv.org Artificial Intelligence 

Certifying nonnegativity of polynomials is a well-known NP-hard problem with direct applications spanning non-convex optimization, control, robotics, and beyond. A sufficient condition for nonnegativity is the Sum of Squares (SOS) property, i.e., it can be written as a sum of squares of other polynomials. In practice, however, certifying the SOS criterion remains computationally expensive and often involves solving a Semidefinite Program (SDP), whose dimensionality grows quadratically in the size of the monomial basis of the SOS expression; hence, various methods to reduce the size of the monomial basis have been proposed. In this work, we introduce the first learning-augmented algorithm to certify the SOS criterion. To this end, we train a Transformer model that predicts an almost-minimal monomial basis for a given polynomial, thereby drastically reducing the size of the corresponding SDP. Our overall methodology comprises three key components: efficient training dataset generation of over 100 million SOS polynomials, design and training of the corresponding Transformer architecture, and a systematic fallback mechanism to ensure correct termination, which we analyze theoretically. We validate our approach on over 200 benchmark datasets, achieving speedups of over 100 compared to state-of-the-art solvers and enabling the solution of instances where competing approaches fail. Our findings provide novel insights towards transforming the practical scalability of SOS programming. Since this expression is positive when γ < 1 and zero when γ = 1, we conclude that γ = 1 is the global minimum of the polynomial. While this example admits a straightforward algebraic verification, deciding nonnegativity is NP-hard in general, even for simple quartic polynomials such as the one above, which motivates the use of convex relaxations (Ahmadi et al., 2011). The method guarantees correctness: if a SOS certificate exists, it will be found; otherwise, infeasibility is certified at the full basis.