AITopics | cumulant

By far the most common way to estimate an expected loss in machine learning is to draw samples, compute the loss on each one, and take the empirical average. However, sampling is not necessarily optimal. Given an MLP at initialization, we show how to estimate its expected output over Gaussian inputs without running samples through the network at all. Instead, we produce approximate representations of the distributions of activations at each layer, leveraging tools such as cumulants and Hermite expansions. We show both theoretically and empirically that for sufficiently wide networks, our estimator achieves a target mean squared error using substantially fewer FLOPs than Monte Carlo sampling. We find moreover that our methods perform particularly well at estimating the probabilities of rare events, and additionally demonstrate how they can be used for model training. Together, these findings suggest a path to producing models with a greatly reduced probability of catastrophic tail risks.
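As a rough illustration of estimating an expectation over Gaussian inputs without sampling, the sketch below covers only a single ReLU unit whose pre-activation is Gaussian, where the mean is available in closed form. It is not the estimator described in the abstract above, which propagates cumulants and Hermite expansions through whole layers; the function names, the choice of ReLU, and the sample count are assumptions made for this example.

```python
import math
import random


def expected_relu_gaussian(mu, sigma):
    """Closed-form E[max(z, 0)] for z ~ N(mu, sigma^2); no samples drawn."""
    if sigma == 0.0:
        return max(mu, 0.0)
    t = mu / sigma
    pdf = math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)  # standard normal density at t
    cdf = 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))         # standard normal CDF at t
    return mu * cdf + sigma * pdf


def monte_carlo_relu(mu, sigma, n_samples, seed=0):
    """Monte Carlo baseline: average max(z, 0) over n_samples Gaussian draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += max(rng.gauss(mu, sigma), 0.0)
    return total / n_samples


if __name__ == "__main__":
    # Hypothetical pre-activation statistics, chosen only for illustration.
    mu, sigma = -1.0, 1.0
    print("closed form:", expected_relu_gaussian(mu, sigma))
    print("monte carlo:", monte_carlo_relu(mu, sigma, 10_000))
```

The closed form costs a handful of operations regardless of the target accuracy, while the Monte Carlo error shrinks only with the square root of the sample count.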

algorithm, artificial intelligence, machine learning, (14 more...)

arXiv.org Machine Learning

2605.05179

Country:

North America > United States (0.28)
Europe (0.27)

Genre: Research Report > New Finding (0.66)

Industry: Leisure & Entertainment (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

From Information Geometry to Jet Substructure: A Triality of Cumulant Tensors, Energy Correlators, and Hypergraphs

Bal, Aritra, Klute, Markus, Maier, Benedikt, Spannowsky, Michael

arXiv.org Machine Learning, May-8-2026

Pairwise Fisher graphs capture local covariance information, but they cannot distinguish an irreducible multi-observable radiation pattern from a collection of ordinary pairwise correlations. We show that this missing structure is naturally supplied by higher-order Fisher tensors. In a finite basis of binned EECs, ECFs, or EFPs, and in the natural exponential-family coordinates generated by that basis, the same local tensor has three equivalent interpretations: a coefficient in the local Kullback-Leibler expansion, a connected cumulant of the chosen correlator observables, and a signed weight on a hyperedge linking those observables. This gives an exact Fisher-correlator-hypergraph triality in the local exponential-family embedding. The triality provides a direct construction of physics-informed hypergraphs from correlator data. Extending the quadratic Fisher matrix to the first non-trivial higher tensor identifies genuinely connected multi-observable radiation patterns, supplies hyperedge weights for higher-order Laplacians and message passing, and gives a principled criterion for compressing observable bases beyond pairwise information. We develop these constructions and spell out why the exact cumulant interpretation is special to natural exponential-family coordinates. We illustrate the framework in four applications. In a minimal local-KL study, the cubic Fisher tensor reduces the KL truncation error and isolates the dominant triplet structure. In a two-versus-three prong jet substructure benchmark, the hypergraph selector improves compressed-basis classification. In a 33-observable basis-design problem, the Fisher hypergraph retains more third-order local response at twelve observables. A low-capacity learning benchmark then shows how the same Fisher hyperedges can be used as an interpretable inductive bias for message passing on correlator observables.
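The link between the local Kullback-Leibler expansion and cumulants can be checked in the simplest natural exponential family. The sketch below uses a one-parameter Bernoulli model, which is an assumption made for this example and not the correlator basis of the abstract above. With log-partition function A(θ) = log(1 + e^θ), the derivatives A''(θ) and A'''(θ) are the second and third cumulants of the sufficient statistic, and KL(p_θ || p_{θ+δ}) = A(θ+δ) - A(θ) - δ A'(θ) expands as κ2 δ²/2 + κ3 δ³/6 + higher-order terms. All function names are hypothetical.

```python
import math


def sigmoid(theta):
    return 1.0 / (1.0 + math.exp(-theta))


def cumulants(theta):
    """Second and third cumulants of a Bernoulli variable with natural parameter theta."""
    p = sigmoid(theta)
    k2 = p * (1.0 - p)
    k3 = k2 * (1.0 - 2.0 * p)
    return k2, k3


def bernoulli_kl(theta_from, theta_to):
    """Exact KL(p_from || p_to) between two Bernoulli distributions."""
    p, q = sigmoid(theta_from), sigmoid(theta_to)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def kl_expansion(theta, delta, order):
    """Local KL expansion around theta, truncated at order 2 or 3 in delta."""
    k2, k3 = cumulants(theta)
    approx = 0.5 * k2 * delta ** 2
    if order >= 3:
        approx += k3 * delta ** 3 / 6.0
    return approx


if __name__ == "__main__":
    theta, delta = 0.5, 0.2  # arbitrary illustrative values
    exact = bernoulli_kl(theta, theta + delta)
    for order in (2, 3):
        err = abs(exact - kl_expansion(theta, delta, order))
        print(f"order {order}: truncation error {err:.3e}")
```

In this toy case the cubic term plays the role of the first higher-order tensor: adding it reduces the truncation error of the quadratic (Fisher) approximation.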

artificial intelligence, data mining, machine learning, (17 more...)

arXiv.org Machine Learning

2605.03063

Country: Europe > United Kingdom (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Data Science > Data Mining (0.67)

b05bffeb1ef937677ef0e32f027b4c80-Supplemental-Conference.pdf

Neural Information Processing Systems, Apr-27-2026, 05:01:49 GMT

anc, artificial intelligence, vertex cut, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.45)

243697ace81f57daef8737ff2c5cffd3-Paper-Conference.pdf

Neural Information Processing Systems, Apr-25-2026, 23:03:39 GMT

artificial intelligence, cumulant, machine learning, (18 more...)

Neural Information Processing Systems

Country: Europe (0.67)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

1f69928210578f4cf5b538a8c8806798-Paper-Conference.pdf

Neural Information Processing Systems, Apr-25-2026, 18:05:13 GMT

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.93)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

31917677a66c6eddd3ab1f68b0679e2f-Supplemental.pdf

Neural Information Processing Systems, Apr-25-2026, 09:20:49 GMT

artificial intelligence, machine learning, tanh 2, (18 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

2d290e496d16c9dcaa9b4ded5cac10cc-Supplemental.pdf

Neural Information Processing Systems, Apr-25-2026, 07:24:08 GMT

This appendix contains proofs of the results in the main text and further analysis of the two FIM estimators $\hat{I}_1(\theta)$ and $\hat{I}_2(\theta)$. In particular, Appendix C analyzes how the FIM estimators and their covariance tensors change under reparametrization. Appendix D presents element-wise alternatives to the bounds given in Section 3.2. Appendix E explores results under norms other than the Frobenius norm used in the main text. Appendix F analyzes linear combinations of the two FIM estimators.

artificial intelligence, estimator, machine learning, (18 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

The monotonicity of the Franz-Parisi potential is equivalent with Low-degree MMSE lower bounds

Tsirkas, Konstantinos, Wang, Leda, Zadik, Ilias

arXiv.org Machine Learning, Mar-23-2026

Over the last decades, two distinct approaches have been instrumental to our understanding of the computational complexity of statistical estimation. The statistical physics literature predicts algorithmic hardness through local stability and monotonicity properties of the Franz--Parisi (FP) potential \cite{franz1995recipes,franz1997phase}, while the mathematically rigorous literature characterizes hardness via the limitations of restricted algorithmic classes, most notably low-degree polynomial estimators \cite{hopkins2017efficient}. For many inference models, these two perspectives yield strikingly consistent predictions, giving rise to a long-standing open problem of establishing a precise mathematical relationship between them. In this work, we show that for estimation problems the power of low-degree polynomials is equivalent to the monotonicity of the annealed FP potential for a broad family of Gaussian additive models (GAMs) with signal-to-noise ratio $λ$. In particular, subject to a low-degree conjecture for GAMs, our results imply that the polynomial-time limits of these models are directly implied by the monotonicity of the annealed FP potential, in conceptual agreement with predictions from the physics literature dating back to the 1990s.

artificial intelligence, lemma 10, machine learning, (16 more...)

arXiv.org Machine Learning

2603.2007

Country: