AITopics

2605.17154

Country: Europe > Netherlands (0.28)

Genre: Research Report (0.84)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.66)
Banking & Finance > Trading (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.75)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.68)

Raha, Swarnali, Khare, Kshitij, Patra, Rohit K

Optimality of Sub-network Laplace Approximations: New Results and Methods

arXiv.org Machine LearningMay-12-2026

Although the Laplace approximation offers a simple route to uncertainty quantification in deep neural networks, its reliance on inverting large Hessian matrices has motivated a range of computationally feasible low-dimensional or sparse approximations. A prominent class of such methods - sub-network Laplace approximations, constructs surrogates by restricting attention to a small subset of parameters. Existing approaches in this family typically rely on diagonal, layer-wise, or other architectural heuristics for subset selection, which ignore cross-parameter interactions and lack formal optimality guarantees. In this paper, we provide a rigorous theoretical analysis of the sub-network Laplace paradigm. We prove that all sub-network Laplace methods systematically underestimate the predictive variance of the full Laplace posterior, and that this bias decreases monotonically as the retained sub-matrix expands. Leveraging this insight, we propose two principled, analytically grounded sub-network Hessian approximations: \textit{Gradient-Laplace} selects parameters with the largest average squared gradients of the model output with respect to the parameters over a reference dataset; while \textit{Greedy-Laplace} iteratively refines this selection by accounting for off-diagonal interactions in the precision matrix. We establish theoretical guarantees characterizing their optimality properties and show that Gradient-Laplace provably outperforms existing heuristic approaches. Extensive numerical studies across diverse settings indicate that these methods perform strongly relative to existing benchmarks.

approximation, artificial intelligence, machine learning, (19 more...)

2605.09075

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.82)

Industry: Transportation (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Oh, Sunmin, Oh, Sang-Yun, Park, Gunwoong

Relaxed Sparsest-Permutation Formulation for Causal Discovery at Scale

arXiv.org Machine LearningMay-8-2026

Despite the growing availability of large datasets, causal structure learning remains computationally prohibitive at scale. We revisit sparsest-permutation learning for linear structural equation models and show that exact Cholesky factorization is unnecessary for structure recovery. This observation motivates a support-level relaxation that searches for sparse triangular factors over a precision-support screening graph. The relaxed formulation can be efficiently evaluated via masked zero-fill incomplete Cholesky factorization, enabling scalable comparison of candidate orderings. At the population level, we establish soundness for Markov equivalence class (MEC) recovery under no-cancellation and sparsest Markov representation assumptions, as well as robustness to ordering misspecification. Motivated by these guarantees, we introduce SCOPE, a sparse-Cholesky pipeline that provides a scalable implementation of the relaxed formulation. Experiments on synthetic and real datasets demonstrate that SCOPE matches the MEC recovery accuracy of substantially slower baselines, while achieving significantly reduced runtime and scaling to 10k variables.

artificial intelligence, machine learning, optimization problem, (17 more...)

2605.05568

Country: North America > United States > California (0.28)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Therapeutic Area > Oncology (0.68)
Health & Medicine > Pharmaceuticals & Biotechnology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.66)

Eugene Belilovsky, Gaël Varoquaux, Matthew B. Blaschko

Testing for Differences in Gaussian Graphical Models: Applications to Brain Connectivity

Neural Information Processing SystemsApr-30-2026, 23:09:41 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, lasso, machine learning, (18 more...)

Country: Europe (0.46)

Genre: Research Report > Experimental Study (0.95)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (0.96)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.70)

Neural Information Processing SystemsApr-25-2026, 10:12:16 GMT

8 max

We proceed to show the sparsistency510 of the estimated parameters. First, suppose that Θ t;ij 6= 0 for some time tand index (i,j). Due to 0 < γ < 1, the above inequality implies that bΘt;ij = 0521 for every t and (i,j) 6 St, and bΘt;ij bΘt 1;ij = 0 for every t > 0 and (i,j) 6 Dt. The proof is inspired527 by Corollary 1 in [47]. First, we present the following key lemmas.528

artificial intelligence, precision matrix, runtime, (17 more...)

Industry: Banking & Finance > Trading (0.47)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.47)

Neural Information Processing SystemsApr-25-2026, 02:26:31 GMT

Learning to Learn Graph Topologies

Learning a graph topology to reveal the underlying relationship between data entities plays an important role in various machine learning and data analysis tasks. Under the assumption that structured data vary smoothly over a graph, the problem can be formulated as a regularised convex optimisation over a positive semidefinite cone and solved by iterative algorithms. Classic methods require an explicit convex function to reflect generic topological priors, e.g. the `1 penalty for enforcing sparsity, which limits the flexibility and expressiveness in learning rich topological structures. We propose to learn a mapping from node data to the graph structure based on the idea of learning to optimise (L2O). Specifically, our model first unrolls an iterative primal-dual splitting algorithm into a neural network. The key structural proximal projection is replaced with a variational autoencoder that refines the estimated graph with enhanced topological properties. The model is trained in an end-to-end fashion with pairs of node data and graph samples. Experiments on both synthetic and real-world data demonstrate that our model is more efficient than classic iterative algorithms in learning a graph with specific topological properties.

artificial intelligence, graph, machine learning, (15 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology > Autism (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Neural Information Processing SystemsApr-24-2026, 15:33:38 GMT

Scalable Intervention Target Estimation in Linear Models

This paper considers the problem of estimating the unknown intervention targets in a causal directed acyclic graph from observational and interventional data. The focus is on soft interventions in linear structural equation models (SEMs). Current approaches to causal structure learning either work with known intervention targets or use hypothesis testing to discover the unknown intervention targets even for linear SEMs. This severely limits their scalability and sample complexity. This paper proposes a scalable and efficient algorithm that consistently identifies all intervention targets.

algorithm, artificial intelligence, machine learning, (18 more...)

Country: North America (0.68)

Genre: Research Report (0.68)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.93)
Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

Neural Information Processing SystemsApr-22-2026, 13:46:56 GMT

Semiparametric Differential Graph Models

Pan Xu, Quanquan Gu

Neural Information Processing Systems http://nips.cc/

artificial intelligence, estimator, machine learning, (14 more...)

Country:

Europe (0.46)
North America > United States (0.28)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.94)
Health & Medicine > Therapeutic Area > Oncology (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.46)

Hejný, Ivan, Bonaccolto, Giovanni, Kremer, Philipp, Paterlini, Sandra, Bogdan, Małgorzata, Wallin, Jonas

Asymptotic Theory for Graphical SLOPE: Precision Estimation and Pattern Convergence

arXiv.org Machine LearningApr-15-2026

This paper studies Graphical SLOPE for precision matrix estimation, with emphasis on its ability to recover both sparsity and clusters of edges with equal or similar strength. In a fixed-dimensional regime, we establish that the root-$n$ scaled estimation error converges to the unique minimizer of a strictly convex optimization problem defined through the directional derivative of the SLOPE penalty. We also establish convergence of the induced SLOPE pattern, thereby obtaining an asymptotic characterization of the clustering structure selected by the estimator. A comparison with GLASSO shows that the grouping property of SLOPE can substantially improve estimation accuracy when the precision matrix exhibits structured edge patterns. To assess the effect of departures from Gaussianity, we then analyze Gaussian-loss precision matrix estimation under elliptical distributions. In this setting, we derive the limiting distribution and quantify the inflation in variability induced by heavy tails relative to the Gaussian benchmark. We also study TSLOPE, based on the multivariate $t$-loss, and derive its limiting distribution. The results show that TSLOPE offers clear advantages over GSLOPE under heavy-tailed data-generating mechanisms. Simulation evidence suggests that these qualitative conclusions persist in high-dimensional settings, and an empirical application shows that SLOPE-based estimators, especially TSLOPE, can uncover economically meaningful clustered dependence structures.

artificial intelligence, machine learning, matrix, (19 more...)