orthogonal polynomial
Sorting out typicality with the inverse moment matrix SOS polynomial
We study a surprising phenomenon related to the representation of a cloud of data points using polynomials. We start with the previously unnoticed empirical observation that, given a collection (a cloud) of data points, the sublevel sets of a certain distinguished polynomial capture the shape of the cloud very accurately. This distinguished polynomial is a sum-of-squares (SOS) derived in a simple manner from the inverse of the empirical moment matrix. In fact, this SOS polynomial is directly related to orthogonal polynomials and the Christoffel function. This allows to generalize and interpret extremality properties of orthogonal polynomials and to provide a mathematical rationale for the observed phenomenon. Among diverse potential applications, we illustrate the relevance of our results on a network intrusion detection task for which we obtain performances similar to existing dedicated methods reported in the literature.
Determinantal point processes based on orthogonal polynomials for sampling minibatches in SGD
Stochastic gradient descent (SGD) is a cornerstone of machine learning. When the number $N$ of data items is large, SGD relies on constructing an unbiased estimator of the gradient of the empirical risk using a small subset of the original dataset, called a minibatch. Default minibatch construction involves uniformly sampling a subset of the desired size, but alternatives have been explored for variance reduction. In particular, experimental evidence suggests drawing minibatches from determinantal point processes (DPPs), tractable distributions over minibatches that favour diversity among selected items. However, like in recent work on DPPs for coresets, providing a systematic and principled understanding of how and why DPPs help has been difficult. In this work, we contribute an orthogonal polynomial-based determinantal point process paradigm for performing minibatch sampling in SGD.
MeixnerNet: Adaptive and Robust Spectral Graph Neural Networks with Discrete Orthogonal Polynomials
Spectral Graph Neural Networks (GNNs) have achieved state-of-the-art results by defining graph convolutions in the spectral domain. A common approach, popularized by ChebyNet, is to use polynomial filters based on continuous orthogonal polynomials (e.g., Chebyshev). This creates a theoretical disconnect, as these continuous-domain filters are applied to inherently discrete graph structures. We hypothesize this mismatch can lead to suboptimal performance and fragility to hyperparameter settings. In this paper, we introduce MeixnerNet, a novel spectral GNN architecture that employs discrete orthogonal polynomials -- specifically, the Meixner polynomials $M_k(x; ฮฒ, c)$. Our model makes the two key shape parameters of the polynomial, beta and c, learnable, allowing the filter to adapt its polynomial basis to the specific spectral properties of a given graph. We overcome the significant numerical instability of these polynomials by introducing a novel stabilization technique that combines Laplacian scaling with per-basis LayerNorm. We demonstrate experimentally that MeixnerNet achieves competitive-to-superior performance against the strong ChebyNet baseline at the optimal K = 2 setting (winning on 2 out of 3 benchmarks). More critically, we show that MeixnerNet is exceptionally robust to variations in the polynomial degree K, a hyperparameter to which ChebyNet proves to be highly fragile, collapsing in performance where MeixnerNet remains stable.
Approximating the universal thermal climate index using sparse regression with orthogonal polynomials
Roman, Sabin, Skok, Gregor, Todorovski, Ljupco, Dzeroski, Saso
This article explores novel data-driven modeling approaches for analyzing and approximating the Universal Thermal Climate Index (UTCI), a physiologically-based metric integrating multiple atmospheric variables to assess thermal comfort. Given the nonlinear, multivariate structure of UTCI, we investigate symbolic and sparse regression techniques as tools for interpretable and efficient function approximation. In particular, we highlight the benefits of using orthogonal polynomial bases-such as Legendre polynomials-in sparse regression frameworks, demonstrating their advantages in stability, convergence, and hierarchical interpretability compared to standard polynomial expansions. We demonstrate that our models achieve significantly lower root-mean squared losses than the widely used sixth-degree polynomial benchmark-while using the same or fewer parameters. By leveraging Legendre polynomial bases, we construct models that efficiently populate a Pareto front of accuracy versus complexity and exhibit stable, hierarchical coefficient structures across varying model capacities. Training on just 20% of the data, our models generalize robustly to the remaining 80%, with consistent performance under bootstrapping. The decomposition effectively approximates the UTCI as a Fourier-like expansion in an orthogonal basis, yielding results near the theoretical optimum in the L2 (least squares) sense. We also connect these findings to the broader context of equation discovery in environmental modeling, referencing probabilistic grammar-based methods that enforce domain consistency and compactness in symbolic expressions. Taken together, these results illustrate how combining sparsity, orthogonality, and symbolic structure enables robust, interpretable modeling of complex environmental indices like UTCI - and significantly outperforms the state-of-the-art approximation in both accuracy and efficiency.
R2 If there is one sampling scheme [ ] it must be explained better [ ] this part should be very clean and clear
We thank reviewers R1, R2 and R3 for their constructive com ments. We give here answers to the main ones. R2 If there is one sam pling scheme [...] it must be ex plained bet ter [...] this part should be very clean and clear . R1 I also ex pect more de tails on the method olog i cal mod i fi ca tion over BH for the pro posed sam pling method. We agree, and we have expanded Section 3.3 and Appendix A.4 to shed light on The original sampling algorithm for projection DPPs by Hough et al. (2006, Algorithm 18) works as follows.
Determinantal point processes based on orthogonal polynomials for sampling minibatches in SGD
Stochastic gradient descent (SGD) is a cornerstone of machine learning. When the number N of data items is large, SGD relies on constructing an unbiased estimator of the gradient of the empirical risk using a small subset of the original dataset, called a minibatch. Default minibatch construction involves uniformly sampling a subset of the desired size, but alternatives have been explored for variance reduction. In particular, experimental evidence suggests drawing minibatches from determinantal point processes (DPPs), tractable distributions over minibatches that favour diversity among selected items. However, like in recent work on DPPs for coresets, providing a systematic and principled understanding of how and why DPPs help has been difficult.