AITopics | Mathematical & Statistical Methods

Mathematical & Statistical Methods

Feature Augmentation of GNNs for ILPs: Local Uniqueness Suffices

Han, Qingyu, Li, Qian, Yang, Linxin, Chen, Qian, Shi, Qingjiang, Sun, Ruoyu

arXiv.org Artificial Intelligence, Sep-26-2025

Integer Linear Programs (ILPs) are central to real-world optimizations but notoriously difficult to solve. Learning to Optimize (L2O) has emerged as a promising paradigm, with Graph Neural Networks (GNNs) serving as the standard backbone. However, standard anonymous GNNs are limited in expressiveness for ILPs, and the common enhancement of augmenting nodes with globally unique identifiers (UIDs) typically introduces spurious correlations that severely harm generalization. To address this tradeoff, we propose a parsimonious Local-UID scheme based on d-hop uniqueness coloring, which ensures identifiers are unique only within each node's d-hop neighborhood. Building on this scheme, we introduce ColorGNN, which incorporates color information via color-conditioned embeddings, and ColorUID, a lightweight feature-level variant. We prove that for d-layer networks, Local-UIDs achieve the expressive power of Global-UIDs while offering stronger generalization. Extensive experiments show that our approach (i) yields substantial gains on three ILP benchmarks, (ii) exhibits strong OOD generalization on linear programming datasets, and (iii) further improves a general graph-level task when paired with a state-of-the-art method.
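The idea of identifiers that are unique only within each node's d-hop neighborhood can be sketched as a greedy graph coloring. This is a minimal illustration under stated assumptions, not the authors' algorithm: the function names (`within_distance`, `local_uid_coloring`, `is_locally_unique`) are hypothetical, the graph is a plain adjacency dictionary rather than an ILP variable-constraint graph, and the greedy order is arbitrary. It relies on the observation that two nodes share some d-hop neighborhood exactly when they are at most 2d hops apart.

```python
from collections import deque

def within_distance(adj, source, radius):
    """Return all nodes other than `source` reachable in at most `radius` hops."""
    seen = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if seen[u] == radius:
            continue  # do not expand beyond the requested radius
        for v in adj[u]:
            if v not in seen:
                seen[v] = seen[u] + 1
                queue.append(v)
    del seen[source]
    return set(seen)

def local_uid_coloring(adj, d):
    """Greedy coloring in which no color repeats inside any d-hop neighborhood.

    Assumption: two nodes lie in a common d-hop neighborhood exactly when they
    are at most 2*d hops apart, so colors are not reused within distance 2*d.
    """
    colors = {}
    for node in sorted(adj):
        taken = {colors[v] for v in within_distance(adj, node, 2 * d) if v in colors}
        c = 0
        while c in taken:
            c += 1
        colors[node] = c
    return colors

def is_locally_unique(adj, colors, d):
    """Check that every closed d-hop neighborhood has pairwise distinct colors."""
    for node in adj:
        hood = within_distance(adj, node, d) | {node}
        if len({colors[v] for v in hood}) != len(hood):
            return False
    return True

# Toy example: a path graph on 8 nodes, 0-1-2-...-7.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j < 8] for i in range(8)}
coloring = local_uid_coloring(path, d=1)
```

On this toy path the greedy pass needs only three colors, whereas globally unique identifiers would need eight, which is the kind of parsimony the abstract describes.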

artificial intelligence, machine learning, optimization problem, (18 more...)

arXiv.org Artificial Intelligence

2509.21

Country: Asia > China (0.47)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.46)

HyP-ASO: A Hybrid Policy-based Adaptive Search Optimization Framework for Large-Scale Integer Linear Programs

Xu, Ning, Zhang, Junkai, Wu, Yang, Ye, Huigen, Xu, Hua, Xu, Huiling, Zhang, Yifan

arXiv.org Artificial Intelligence, Sep-23-2025

Directly solving large-scale Integer Linear Programs (ILPs) using traditional solvers is slow due to their NP-hard nature. While recent frameworks based on Large Neighborhood Search (LNS) can accelerate the solving process, their performance is often constrained by the difficulty in generating sufficiently effective neighborhoods. To address this challenge, we propose HyP-ASO, a hybrid policy-based adaptive search optimization framework that combines a customized formula with deep Reinforcement Learning (RL). The formula leverages feasible solutions to calculate the selection probabilities for each variable in the neighborhood generation process, and the RL policy network predicts the neighborhood size. Extensive experiments demonstrate that HyP-ASO significantly outperforms existing LNS-based approaches for large-scale ILPs. Additional experiments show it is lightweight and highly scalable, making it well-suited for solving large-scale ILPs.

artificial intelligence, machine learning, optimization problem, (18 more...)

arXiv.org Artificial Intelligence

2509.15828

Country: Asia > China (0.29)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Learning from Heterophilic Graphs: A Spectral Theory Perspective on the Impact of Self-Loops and Parallel Edges

Bose, Kushal, Das, Swagatam

arXiv.org Artificial Intelligence, Sep-17-2025

Graph heterophily poses a formidable challenge to the performance of Message-passing Graph Neural Networks (MP-GNNs). Familiar low-pass filters such as Graph Convolutional Networks (GCNs) suffer performance degradation, which can be attributed to the blending of messages from dissimilar neighboring nodes. The performance of low-pass filters on heterophilic graphs still requires in-depth analysis. In this context, we modify heterophilic graphs by adding a number of self-loops and parallel edges. We observe that the eigenvalues of the graph Laplacian decrease as the number of self-loops increases and increase as the number of parallel edges increases. We conduct several studies of GCN performance on various benchmark heterophilic networks by adding either self-loops or parallel edges. The studies reveal that GCN exhibits either increasing or decreasing performance trends when self-loops and parallel edges are added. In light of these studies, we establish connections between the graph spectra and the performance trends of low-pass filters on heterophilic graphs. The graph spectra characterize essential intrinsic properties of the input graph, such as the presence of connected components, sparsity, average degree, and cluster structures. Our approach makes it possible to assess the graph spectrum and related properties by observing the performance trends of low-pass filters, without performing a costly eigenvalue decomposition. Theoretical foundations are also discussed to validate the impact of adding self-loops and parallel edges on the graph spectrum.
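The self-loop half of the spectral observation can be checked numerically on a toy graph. The sketch below is an illustration under assumptions, not the paper's experimental setup: it assumes the symmetric normalized Laplacian, assumes each self-loop adds 1 to a node's degree, and uses hypothetical helper names (`normalized_laplacian`, `largest_eigenvalue`). It covers only self-loops; parallel edges are not modelled.

```python
import math

def normalized_laplacian(adjacency, self_loops=0):
    """Return I - D^{-1/2} (A + self_loops * I) D^{-1/2} as a list of rows."""
    n = len(adjacency)
    a = [[adjacency[i][j] + (self_loops if i == j else 0) for j in range(n)]
         for i in range(n)]
    degree = [sum(row) for row in a]  # assumption: each self-loop adds 1 to the degree
    return [[(1.0 if i == j else 0.0) - a[i][j] / math.sqrt(degree[i] * degree[j])
             for j in range(n)] for i in range(n)]

def largest_eigenvalue(matrix, iterations=500):
    """Power iteration; applicable because the matrix is symmetric positive semidefinite."""
    n = len(matrix)
    vec = [float(i + 1) for i in range(n)]  # arbitrary start vector
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    image = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
    return sum(v * w for v, w in zip(vec, image))  # Rayleigh quotient

# Toy example: a 4-cycle with 0, 1, and 2 self-loops per node.
cycle4 = [[0, 1, 0, 1],
          [1, 0, 1, 0],
          [0, 1, 0, 1],
          [1, 0, 1, 0]]
top = [largest_eigenvalue(normalized_laplacian(cycle4, k)) for k in (0, 1, 2)]
```

For the 4-cycle the largest eigenvalue falls from 2 to 4/3 to 1 as self-loops are added, consistent with the decreasing trend described above for this particular choice of Laplacian.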

artificial intelligence, eigenvalue, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2509.13139

Country: North America > United States (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.46)

Information Entropy-Based Scheduling for Communication-Efficient Decentralized Learning

Nagar, Jaiprakash, Chen, Zheng, Kountouris, Marios, Stavrou, Photios A.

arXiv.org Artificial Intelligence, Sep-16-2025

This paper addresses decentralized stochastic gradient descent (D-SGD) over resource-constrained networks by introducing node-based and link-based scheduling strategies to enhance communication efficiency. In each iteration of the D-SGD algorithm, only a few disjoint subsets of nodes or links are randomly activated, subject to a given communication cost constraint. We propose a novel importance metric based on information entropy to determine node and link scheduling probabilities. We validate the effectiveness of our approach through extensive simulations, comparing it against state-of-the-art methods, including betweenness centrality (BC) for node scheduling and MATCHA for link scheduling. The results show that our method consistently outperforms the BC-based method in the node scheduling case, achieving faster convergence with up to 60% lower communication budgets. At higher communication budgets (above 60%), our method maintains comparable or superior performance. In the link scheduling case, our method delivers results that are superior to or on par with those of MATCHA.

artificial intelligence, machine learning, node, (17 more...)

arXiv.org Artificial Intelligence

2507.17426

Country: Europe > France (0.28)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.49)

Low-degree lower bounds via almost orthonormal bases

Carpentier, Alexandra, Giancola, Simone Maria, Giraud, Christophe, Verzelen, Nicolas

arXiv.org Machine Learning, Sep-12-2025

Low-degree polynomials have emerged as a powerful paradigm for providing evidence of statistical-computational gaps across a variety of high-dimensional statistical models [Wein25]. For detection problems -- where the goal is to test a planted distribution $\mathbb{P}'$ against a null distribution $\mathbb{P}$ with independent components -- the standard approach is to bound the advantage using an $\mathbb{L}^2(\mathbb{P})$-orthonormal family of polynomials. However, this method breaks down for estimation tasks or more complex testing problems where $\mathbb{P}$ has some planted structures, so that no simple $\mathbb{L}^2(\mathbb{P})$-orthogonal polynomial family is available. To address this challenge, several technical workarounds have been proposed [SW22,SW25], though their implementation can be delicate. In this work, we propose a more direct proof strategy. Focusing on random graph models, we construct a basis of polynomials that is almost orthonormal under $\mathbb{P}$, in precisely those regimes where statistical-computational gaps arise. This almost orthonormal basis not only yields a direct route to establishing low-degree lower bounds, but also allows us to explicitly identify the polynomials that optimize the low-degree criterion. This, in turn, provides insights into the design of optimal polynomial-time algorithms. We illustrate the effectiveness of our approach by recovering known low-degree lower bounds, and establishing new ones for problems such as hidden subcliques, stochastic block models, and seriation models.
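For readers unfamiliar with the standard approach mentioned above, the following is the usual textbook formulation of the low-degree advantage. It is given as background rather than taken from the paper. The symbols $\mathrm{Adv}_{\le D}$ and $\phi_\alpha$ are notation assumed here, with $\{\phi_\alpha\}$ an $\mathbb{L}^2(\mathbb{P})$-orthonormal basis of the polynomials of degree at most $D$.

```latex
\mathrm{Adv}_{\le D}
  \;=\; \sup_{f \,:\, \deg f \le D}
        \frac{\mathbb{E}_{\mathbb{P}'}[f]}{\sqrt{\mathbb{E}_{\mathbb{P}}[f^2]}},
\qquad
\mathrm{Adv}_{\le D}^{2}
  \;=\; \sum_{\alpha} \bigl(\mathbb{E}_{\mathbb{P}'}[\phi_\alpha]\bigr)^{2}.
```

The second identity follows by writing $f = \sum_\alpha c_\alpha \phi_\alpha$, so that $\mathbb{E}_{\mathbb{P}}[f^2] = \sum_\alpha c_\alpha^2$ and $\mathbb{E}_{\mathbb{P}'}[f] = \sum_\alpha c_\alpha \mathbb{E}_{\mathbb{P}'}[\phi_\alpha]$, and then applying the Cauchy-Schwarz inequality. This closed form is only available when such an orthonormal family exists, which is why planted structure in $\mathbb{P}$ is an obstacle.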

lemma, node, polynomial, (14 more...)

arXiv.org Machine Learning

2509.09353

Country:

Europe > Germany > Brandenburg > Potsdam (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > United States > District of Columbia > Washington (0.04)
Europe > France > Occitanie > Hérault > Montpellier (0.04)

Genre: Research Report (0.81)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.34)

Contributions to Robust and Efficient Methods for Analysis of High Dimensional Data

Yang, Kai

arXiv.org Artificial Intelligence, Sep-11-2025

A ubiquitous feature of the data of our era is their very large size and dimension. Analyzing such high-dimensional data poses significant challenges, since the feature dimension is often much larger than the sample size. This thesis introduces robust and computationally efficient methods to address several common challenges associated with high-dimensional data. In my first manuscript, I propose a coherent approach to variable screening that accommodates nonlinear associations. I develop a novel variable screening method that transcends traditional linear assumptions by leveraging mutual information, with an intended application in neuroimaging data. This approach allows for accurate identification of important variables by capturing nonlinear as well as linear relationships between the outcome and covariates. Building on this foundation, I develop new optimization methods for sparse estimation using nonconvex penalties in my second manuscript. These methods address notable challenges in current statistical computing practices, facilitating computationally efficient and robust analyses of complex datasets. The proposed method can be applied to a general class of optimization problems. In my third manuscript, I contribute to robust modeling of high-dimensional correlated observations by developing a mixed-effects model based on Tsallis power-law entropy maximization and discussing the theoretical properties of the resulting distribution. This model surpasses the constraints of conventional Gaussian models by accommodating a broader class of distributions with enhanced robustness to outliers. Additionally, I develop a proximal nonlinear conjugate gradient algorithm that accelerates convergence while maintaining numerical stability, along with rigorous statistical properties for the proposed framework.

artificial intelligence, data mining, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2509.08155

Country:

Europe (0.92)
North America > United States > New York (0.28)
North America > Canada > Quebec (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (0.45)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Mathematics of Computing (1.00)
Information Technology > Data Science > Data Mining (1.00)

Quality control in sublinear time: a case study via random graphs

Marcussen, Cassandra, Rubinfeld, Ronitt, Sudan, Madhu

arXiv.org Artificial Intelligence, Sep-8-2025

Many algorithms are designed to work well on average over inputs. When running such an algorithm on an arbitrary input, we must ask: Can we trust the algorithm on this input? We identify a new class of algorithmic problems addressing this, which we call "Quality Control Problems." These problems are specified by a (positive, real-valued) "quality function" $ρ$ and a distribution $D$ such that, with high probability, a sample drawn from $D$ is "high quality," meaning its $ρ$-value is near $1$. The goal is to accept inputs $x \sim D$ and reject potentially adversarially generated inputs $x$ with $ρ(x)$ far from $1$. The objective of quality control is thus weaker than either component problem: testing for "$ρ(x) \approx 1$" or testing if $x \sim D$, and offers the possibility of more efficient algorithms. In this work, we consider the sublinear version of the quality control problem, where $D \in Δ(\{0,1\}^N)$ and the goal is to solve the $(D ,ρ)$-quality problem with $o(N)$ queries and time. As a case study, we consider random graphs, i.e., $D = G_{n,p}$ (and $N = \binom{n}2$), and the $k$-clique count function $ρ_k := C_k(G)/\mathbb{E}_{G' \sim G_{n,p}}[C_k(G')]$, where $C_k(G)$ is the number of $k$-cliques in $G$. Testing if $G \sim G_{n,p}$ with one sample, let alone with sublinear query access to the sample, is of course impossible. Testing if $ρ_k(G)\approx 1$ requires $p^{-Ω(k^2)}$ samples. In contrast, we show that the quality control problem for $G_{n,p}$ (with $n \geq p^{-ck}$ for some constant $c$) with respect to $ρ_k$ can be tested with $p^{-O(k)}$ queries and time, showing quality control is provably superpolynomially more efficient in this setting. More generally, for a motif $H$ of maximum degree $Δ(H)$, the respective quality control problem can be solved with $p^{-O(Δ(H))}$ queries and running time.
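The quality function $ρ_k$ defined above can be made concrete with a brute-force computation. This sketch is illustrative only: it reads the whole graph and enumerates all $k$-subsets, so it is not the sublinear algorithm the abstract refers to. The function names are hypothetical, and the expectation uses the standard identity $\mathbb{E}[C_k] = \binom{n}{k} p^{\binom{k}{2}}$ for $G_{n,p}$.

```python
from itertools import combinations
from math import comb

def count_k_cliques(n, edges, k):
    """Count k-cliques by checking every k-subset of the vertices 0..n-1."""
    edge_set = {frozenset(e) for e in edges}
    return sum(
        all(frozenset(pair) in edge_set for pair in combinations(nodes, 2))
        for nodes in combinations(range(n), k)
    )

def expected_k_cliques(n, p, k):
    """Expected number of k-cliques in G(n, p): C(n, k) * p^C(k, 2)."""
    return comb(n, k) * p ** comb(k, 2)

def rho_k(n, edges, k, p):
    """Observed k-clique count divided by its expectation under G(n, p)."""
    return count_k_cliques(n, edges, k) / expected_k_cliques(n, p, k)

# Toy example: the complete graph on 5 vertices has 10 triangles.
k5_edges = list(combinations(range(5), 2))
triangles = count_k_cliques(5, k5_edges, 3)
quality_at_half = rho_k(5, k5_edges, 3, 0.5)
```

Judged against $p = 0.5$, the complete graph has eight times the expected number of triangles, so its $ρ_3$ is far from $1$ and a quality-control tester should reject it.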

algorithm, artificial intelligence, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2508.16531

Country:

Europe (1.00)
North America > United States > Massachusetts (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Denoising and Reconstruction of Nonlinear Dynamics using Truncated Reservoir Computing

Sedehi, Omid, Yadav, Manish, Stender, Merten, Oberst, Sebastian

arXiv.org Artificial Intelligence, Sep-8-2025

Measurements acquired from distributed physical systems are often sparse and noisy. Therefore, signal processing and system identification tools are required to mitigate noise effects and reconstruct unobserved dynamics from limited sensor data. However, this process is particularly challenging because the fundamental equations governing the dynamics are largely unavailable in practice. Reservoir Computing (RC) techniques have shown promise in efficiently simulating dynamical systems through an unstructured and efficient computation graph comprising a set of neurons with random connectivity. However, the potential of RC to operate in noisy regimes and distinguish noise from the primary smooth or non-smooth deterministic dynamics of the system has not been fully explored. This paper presents a novel RC method for noise filtering and reconstructing unobserved nonlinear dynamics, offering a novel learning protocol associated with hyperparameter optimization. The performance of the RC in terms of noise intensity, noise frequency content, and drastic shifts in dynamical parameters is studied in two illustrative examples involving the nonlinear dynamics of the Lorenz attractor and the adaptive exponential integrate-and-fire system. It is demonstrated that denoising performance improves by truncating redundant nodes and edges of the reservoir, as well as by properly optimizing hyperparameters, such as the leakage rate, spectral radius, input connectivity, and ridge regression parameter. Furthermore, the presented framework shows good generalization behavior when tested for reconstructing unseen and qualitatively different attractors. Compared to the extended Kalman filter, the presented RC framework yields competitive accuracy at low signal-to-noise ratios and high-frequency ranges.

artificial intelligence, machine learning, noise, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1063/5.0273505

2504.13355

Country: Europe (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.83)

Testing for correlation between network structure and high-dimensional node covariates

Fuchs-Kreiss, Alexander, Levin, Keith

arXiv.org Machine Learning, Sep-5-2025

In many application domains, networks are observed with node-level features. In such settings, a common problem is to assess whether or not nodal covariates are correlated with the network structure itself. Here, we present four novel methods for addressing this problem. Two of these are based on a linear model relating node-level covariates to latent node-level variables that drive network structure. The other two are based on applying canonical correlation analysis to the node features and network structure, avoiding the linear modeling assumptions. We provide theoretical guarantees for all four methods when the observed network is generated according to a low-rank latent space model endowed with node-level covariates, which we allow to be high-dimensional. Our methods are computationally cheaper and require fewer modeling assumptions than previous approaches to network dependency testing. We demonstrate and compare the performance of our novel methods on both simulated and real-world data.

artificial intelligence, machine learning, social media, (20 more...)

arXiv.org Machine Learning

2509.03772

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.67)
Health & Medicine > Therapeutic Area > Neurology (0.45)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.67)

Identifiability and minimality bounds of quantum and post-quantum models of classical stochastic processes

Riechers, Paul M., Elliott, Thomas J.

arXiv.org Artificial Intelligence, Sep-4-2025

To make sense of the world around us, we develop models, constructed to enable us to replicate, describe, and explain the behaviours we see. Focusing on the broad case of sequences of correlated random variables, i.e., classical stochastic processes, we tackle the question of determining whether or not two different models produce the same observable behavior. This is the problem of identifiability. Curiously, the physics of the model need not correspond to the physics of the observations; recent work has shown that it is even advantageous -- in terms of memory and thermal efficiency -- to employ quantum models to generate classical stochastic processes. We resolve the identifiability problem in this regime, providing a means to compare any two models of a classical process, be the models classical, quantum, or 'post-quantum', by mapping them to a canonical 'generalized' hidden Markov model. Further, this enables us to place (sometimes tight) bounds on the minimal dimension required of a quantum model to generate a given classical stochastic process.
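What it means for two models to produce the same observable behavior can be illustrated with ordinary hidden Markov models written in matrix form. This is a naive sketch under assumptions, not the paper's canonical construction. A model is taken to be a pair `(initial, transitions)`, where `transitions[x]` is the matrix applied when symbol `x` is emitted, and the probability of a word is the initial vector multiplied by the matrices for its symbols, then summed. The names `word_probability` and `same_behavior` are hypothetical, and comparing words only up to a fixed length is a finite check, not a proof of equivalence.

```python
from itertools import product

def word_probability(initial, transitions, word):
    """Probability of emitting `word`: initial vector times one matrix per symbol, summed."""
    state = list(initial)
    for symbol in word:
        matrix = transitions[symbol]
        state = [sum(state[i] * matrix[i][j] for i in range(len(state)))
                 for j in range(len(matrix[0]))]
    return sum(state)

def same_behavior(model_a, model_b, max_length, tol=1e-12):
    """Compare word probabilities of two models for all words up to max_length."""
    alphabet = sorted(model_a[1])
    for length in range(1, max_length + 1):
        for word in product(alphabet, repeat=length):
            gap = word_probability(*model_a, word) - word_probability(*model_b, word)
            if abs(gap) > tol:
                return False
    return True

# Two different presentations of a fair coin, and one biased coin.
one_state = ([1.0], {0: [[0.5]], 1: [[0.5]]})
two_state = ([0.5, 0.5], {0: [[0.25, 0.25], [0.25, 0.25]],
                          1: [[0.25, 0.25], [0.25, 0.25]]})
biased = ([1.0], {0: [[0.7]], 1: [[0.3]]})

fair_models_agree = same_behavior(one_state, two_state, max_length=4)
fair_vs_biased = same_behavior(one_state, biased, max_length=4)
```

The one-state and two-state models differ internally yet assign the same probability to every word, which is the sense in which distinct models can be observationally identical.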

artificial intelligence, machine learning, stochastic process, (17 more...)

arXiv.org Artificial Intelligence

2509.03004

Country: North America > United States > California (0.46)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
