AITopics | propagation

Iterative Causal Discovery: Per-Edge Impossibility Certificates, Tier-Aware Oracle Queries, and the $1+K$ Lower Bound

Uehara, Eichi

arXiv.org Machine Learning, May-28-2026

Causal-discovery algorithms return a directed graph, yet provide no principled means of distinguishing edge directions identified by the data from those assigned without an identifying assumption. Under the standard Markov and faithfulness conditions, the observational distribution identifies only a Markov equivalence class; orientations within that class are not determined by the joint distribution and cannot be recovered from additional samples alone, but require either a functional restriction or an intervention. We introduce a protocol for observational causal discovery on continuous data that attaches to each candidate edge a discrete impossibility certificate: a RESOLVED code records the identifiability theorem under which the direction was committed, while an IMPOSSIBLE code records the failure mode together with the specific question a domain expert must answer to resolve it. The bivariate cascade is extended with five gated identifiability tiers (LSNM, IGCI, Stein, MDL, and PEIT) that abstain when their precondition test rejects. Two oracle primitives, the meta-hub query and the node-children query, jointly establish an upper bound of $1+K$ expert interactions sufficient to recover any DAG, where $K$ denotes the number of non-leaf vertices. Under an ideal-oracle assumption, the bound is met exactly on the asia, sachs, child, and alarm benchmarks.
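The $1+K$ count can be illustrated with a toy simulation. This is a minimal sketch, not the paper's protocol: the abstract does not define what each primitive returns, so the code assumes that one meta-hub query returns every non-leaf vertex and that one node-children query returns the children of a single vertex. The class `IdealOracle`, the function `recover_dag`, and the four-node graph are hypothetical names and data introduced only for illustration.

```python
from typing import Dict, List, Set

Dag = Dict[str, List[str]]  # maps each node to the list of its children


class IdealOracle:
    """Hypothetical ideal expert that answers from a known ground-truth DAG."""

    def __init__(self, dag: Dag) -> None:
        self._dag = dag
        self.calls = 0  # number of expert interactions used so far

    def meta_hub_query(self) -> Set[str]:
        # Assumption: a single query returns every vertex with at least one child.
        self.calls += 1
        return {v for v, children in self._dag.items() if children}

    def node_children_query(self, node: str) -> List[str]:
        # Assumption: a single query returns all children of the given vertex.
        self.calls += 1
        return list(self._dag.get(node, []))


def recover_dag(nodes: List[str], oracle: IdealOracle) -> Dag:
    """Recover the DAG with one meta-hub query plus one query per non-leaf vertex."""
    recovered: Dag = {v: [] for v in nodes}
    for hub in sorted(oracle.meta_hub_query()):
        recovered[hub] = oracle.node_children_query(hub)
    return recovered


# Toy ground truth: A, B and C are non-leaf vertices, so K = 3.
truth: Dag = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
oracle = IdealOracle(truth)
recovered = recover_dag(list(truth), oracle)
num_non_leaf = sum(1 for children in truth.values() if children)
print(oracle.calls, 1 + num_non_leaf)
```

Under these assumptions the number of oracle calls equals $1+K$ by construction; a real expert may answer incorrectly or decline, which the sketch does not model.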

artificial intelligence, machine learning, query, (17 more...)

arXiv.org Machine Learning

2605.27477

Country: Asia > Japan (0.28)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Memory, Roughness, and Information Persistence in Financial Markets: A Structural Approach to Volatility Forecasting

Deep, Akash, Appiah, Nicholas, Rachev, Svetlozar T.

arXiv.org Machine Learning, May-26-2026

This paper studies the joint role of long-memory dynamics, rough-volatility behavior, and persistence-based forecasting features in equity volatility modeling. We combine semiparametric long-memory estimation, rough-volatility diagnostics, and structured forecasting regressions to examine whether persistence measures contain economically meaningful forecasting information beyond conventional volatility predictors. Using a panel of 115 S&P 500 constituents from November 2001 through April 2026, we document that volatility proxies exhibit substantial long-memory behavior and locally rough dynamics. The cross-sectional mean Geweke-Porter-Hudak estimate of the memory parameter is $\hat{d} = 0.226$, while the corresponding local-Whittle estimate is $\hat{d} = 0.440$, with statistical significance observed across nearly the entire panel. Rolling estimates of persistence rise substantially during the global financial crisis and the COVID period and display a positive contemporaneous association with the VIX. We then examine whether persistence-related features improve out-of-sample volatility forecasts beyond standard HAR and HAR-X benchmarks. Incorporating cross-sectional persistence aggregates, sectoral persistence measures, and persistence-by-stress interaction terms produces moderate but statistically significant forecasting improvements, particularly at longer horizons and during stress regimes. Forecast gains are strongest during periods of elevated market volatility and in volatility-managed portfolio applications. The results suggest that persistence measures may serve as useful reduced-form indicators of the duration and propagation of uncertainty in financial markets, although the paper does not claim structural identification of the economic mechanisms generating persistence.
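For readers unfamiliar with the HAR benchmark named above, the sketch below builds the three regressors it conventionally uses: the latest daily realized volatility and its 5-day and 22-day trailing averages. The 1/5/22 windows are the conventional choice and are an assumption here, since the abstract does not state the paper's specification. The function name `har_features` and the synthetic series are hypothetical.

```python
from statistics import fmean
from typing import List, Tuple


def har_features(rv: List[float], t: int) -> Tuple[float, float, float]:
    """Return (daily, weekly, monthly) HAR regressors using data up to index t.

    daily   = rv[t]
    weekly  = mean of the 5 observations ending at t
    monthly = mean of the 22 observations ending at t
    """
    if t < 21:
        raise ValueError("need at least 22 observations up to index t")
    daily = rv[t]
    weekly = fmean(rv[t - 4 : t + 1])
    monthly = fmean(rv[t - 21 : t + 1])
    return daily, weekly, monthly


# Synthetic, purely illustrative series: 1.0, 2.0, ..., 30.0
rv = [float(x) for x in range(1, 31)]
features = har_features(rv, 21)
print(features)  # (22.0, 20.0, 11.5)
```

In a HAR regression these three values would predict the next period's volatility; a HAR-X variant appends further regressors, which is where persistence features such as those described in the abstract could enter.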

artificial intelligence, machine learning, persistence, (18 more...)

arXiv.org Machine Learning

2605.24285

Genre: Research Report > New Finding (1.00)

Industry:

Banking & Finance > Trading (1.00)
Banking & Finance > Economy (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Dropout Universality: Scaling Laws and Optimal Scheduling at the Edge-of-Chaos

Sarmiento, Lucas Fernandez

arXiv.org Machine Learning, May-22-2026

We develop a mean-field theory of dropout as a perturbation of critical signal propagation at the edge of chaos. Dropout shifts the perfect-alignment fixed point, making the depth scale for information propagation finite even at critical initialization. We derive critical and crossover scaling laws for correlation decay and establish that smooth activations and kinked, ReLU-like activations constitute distinct universality classes, with different critical exponents and a universal two-parameter scaling collapse in detuning and dropout strength. The distinction traces to the analytic structure of the correlation map: smooth activations admit a Taylor expansion near perfect alignment, while kinked activations develop a branch point with universal non-analyticity. As a corollary, the framework yields saturated dropout profiles under fixed budget; a rank-flow tie-breaker then selects front-loaded schedules, substantially reducing held-out test loss at no extra computational cost, with accuracy gains as a consistent secondary effect. We test the predictions in MLPs and Vision Transformers and discuss CNN/ResNet extensions.
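The shift of the perfect-alignment fixed point can be seen in a small numerical sketch. It is not the paper's derivation. It uses the standard mean-field correlation map for ReLU at critical initialization (weight variance 2, no bias) and assumes inverted dropout with independent masks for the two inputs, under which the map is multiplied by the keep probability. The function names and the keep probability 0.9 are illustrative assumptions.

```python
import math


def relu_correlation_map(c: float) -> float:
    """Mean-field correlation map for ReLU at critical initialization, no dropout."""
    c = max(-1.0, min(1.0, c))  # guard against rounding outside [-1, 1]
    return (math.sqrt(1.0 - c * c) + c * (math.pi - math.acos(c))) / math.pi


def propagate(c0: float, keep_prob: float, depth: int) -> float:
    """Iterate the correlation map through `depth` layers.

    Assumption: dropout with independent masks inflates the variance by
    1 / keep_prob but leaves the covariance unchanged, so the correlation
    is multiplied by keep_prob at every layer.
    """
    c = c0
    for _ in range(depth):
        c = keep_prob * relu_correlation_map(c)
    return c


no_dropout = propagate(1.0, 1.0, 50)  # c = 1 is a fixed point without dropout
one_layer = propagate(1.0, 0.9, 1)    # a single dropout layer already breaks alignment
deep = propagate(1.0, 0.9, 50)        # correlation settles at a fixed point below 1
print(no_dropout, one_layer, deep)
```

Without dropout two identical inputs stay perfectly correlated at any depth; with dropout the correlation drops below 1 after one layer and converges to a fixed point strictly below 1, consistent with the finite depth scale described in the abstract.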

artificial intelligence, dropout, machine learning, (15 more...)

arXiv.org Machine Learning

2605.21648

Country: North America > United States > Pennsylvania (0.28)

Genre: Research Report (0.64)

Industry: Energy (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)

Uniform-in-Time Weak Propagation-of-Chaos in Shallow Neural Networks

Glasgow, Margalit, Bruna, Joan

arXiv.org Machine Learning, May-22-2026

We consider one-hidden-layer neural networks trained in the feature-learning regime using gradient descent, and relate the output of the finite-width network $f_{\hat{\rho}_t^m}$ to its infinite-width counterpart $f_{\rho_t^{MF}}$, which evolves in the mean-field dynamics. While constant-time horizon bounds for $\|f_{\rho_t^{MF}} - f_{\hat{\rho}_t^m}\|$ may be obtained via standard Grönwall estimates, the long-time behavior of the fluctuation is a more delicate matter. Uniform-in-time bounds often rely on (local) strong convexity in the landscape or logarithmic Sobolev inequalities present in noisy gradient dynamics. In this work, we establish non-asymptotic weak propagation-of-chaos that holds uniformly in time, obtained by exploiting instead the convergence rate of the mean-field deterministic Wasserstein-gradient-flow dynamics. Specifically, denoting by $L_t$ the mean-field excess MSE loss at time $t$ and $m$ the number of neurons, under standard regularity assumptions and the condition $\int_0^\infty L_t^{1/2}\,dt = O(\log d)$, we obtain the uniform-in-time bound $\|f_{\rho_t^{MF}} - f_{\hat{\rho}_t^m}\|^2 \lesssim \text{poly}(d)\, m^{-\min(1,c/6)}$ whenever $L_t \lesssim t^{-c}$. Our result holds in a noiseless setting and does not make any assumptions on the geometry of the landscape near the optimum, and extends seamlessly to other forms of discretization, including a finite number of samples and time discretization. A key takeaway of our result is that whenever the convergence rate of the mean-field, population-loss dynamics is faster than $t^{-2}$, we can attain a loss of $\varepsilon$ with only $\text{poly}(d/\varepsilon)$ neurons, training samples, and GD steps.
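As a reading aid, the width requirement implied by the stated bound can be worked out by simple rearrangement. This is arithmetic on the abstract's inequality only, ignoring the constants hidden by $\lesssim$; it is not a statement taken from the paper. Requiring the right-hand side of the bound to be at most $\varepsilon$ gives

$$
\text{poly}(d)\, m^{-\min(1,\,c/6)} \le \varepsilon
\quad\Longleftrightarrow\quad
m \ge \left(\frac{\text{poly}(d)}{\varepsilon}\right)^{1/\min(1,\,c/6)} .
$$

For example, a rate exponent $c \ge 6$ gives $m \ge \text{poly}(d)/\varepsilon$, while $c = 3$ gives $m \ge \left(\text{poly}(d)/\varepsilon\right)^{2}$. Both are polynomial in $d/\varepsilon$.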

artificial intelligence, machine learning, mft, (18 more...)

arXiv.org Machine Learning

2605.2201

Genre: Research Report (0.90)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)

Estimating the expected output of wide random MLPs more efficiently than sampling

Wu, Wilson, Lecomte, Victor, Winer, Michael, Robinson, George, Hilton, Jacob, Christiano, Paul

arXiv.org Machine Learning, May-18-2026

By far the most common way to estimate an expected loss in machine learning is to draw samples, compute the loss on each one, and take the empirical average. However, sampling is not necessarily optimal. Given an MLP at initialization, we show how to estimate its expected output over Gaussian inputs without running samples through the network at all. Instead, we produce approximate representations of the distributions of activations at each layer, leveraging tools such as cumulants and Hermite expansions. We show both theoretically and empirically that for sufficiently wide networks, our estimator achieves a target mean squared error using substantially fewer FLOPs than Monte Carlo sampling. We find moreover that our methods perform particularly well at estimating the probabilities of rare events, and additionally demonstrate how they can be used for model training. Together, these findings suggest a path to producing models with a greatly reduced probability of catastrophic tail risks.
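The idea of replacing sampling with an analytic estimate can be illustrated in the simplest possible case: a single ReLU unit whose pre-activation is Gaussian. This is a minimal sketch and not the paper's estimator, which the abstract describes as layer-wise distribution approximations built from cumulants and Hermite expansions. The closed form below is the standard expectation of a rectified Gaussian; the function names, the seed, and the sample count are illustrative assumptions.

```python
import math
import random


def expected_relu(mu: float, sigma: float) -> float:
    """Closed form of E[max(0, z)] for z ~ N(mu, sigma^2), with sigma > 0."""
    a = mu / sigma
    pdf = math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)  # standard normal density at a
    cdf = 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))         # standard normal CDF at a
    return mu * cdf + sigma * pdf


def monte_carlo_relu(mu: float, sigma: float, n: int, seed: int = 0) -> float:
    """Sampling baseline: average max(0, z) over n Gaussian draws."""
    rng = random.Random(seed)
    return sum(max(0.0, rng.gauss(mu, sigma)) for _ in range(n)) / n


exact = expected_relu(0.0, 1.0)               # equals 1 / sqrt(2 * pi)
sampled = monte_carlo_relu(0.0, 1.0, 20_000)  # noisy estimate of the same quantity
print(exact, sampled)
```

The analytic value costs a handful of arithmetic operations regardless of the target accuracy, whereas the sampling error shrinks only with the square root of the number of draws. Extending this beyond one unit is the hard part that the paper addresses.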

algorithm, artificial intelligence, machine learning, (14 more...)

arXiv.org Machine Learning

2605.05179

Country:

North America > United States (0.28)
Europe (0.27)

Genre: Research Report > New Finding (0.66)

Industry: Leisure & Entertainment (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Learning to Propagate for Graph Meta-Learning

LU LIU, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang

Neural Information Processing Systems, Apr-30-2026, 19:37:30 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, prototype, (16 more...)

Neural Information Processing Systems

Country: Oceania > Australia (0.28)

Industry:

Information Technology (1.00)
Government (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

00ac8ed3b4327bdd4ebbebcb2ba10a00-AuthorFeedback.pdf

Neural Information Processing Systems, Apr-30-2026, 19:37:16 GMT

artificial intelligence, machine learning, prototype, (18 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Approximate inference of marginals using the IBIA framework

Neural Information Processing Systems, Apr-30-2026, 03:08:28 GMT

Exact inference of marginals in probabilistic graphical models (PGMs) is known to be intractable, necessitating the use of approximate methods. Most existing variational techniques perform iterative message passing in loopy graphs, which is slow to converge for many benchmarks. In this paper, we propose a new algorithm for marginal inference that is based on the incremental build-infer-approximate (IBIA) paradigm. Our algorithm converts the PGM into a sequence of linked clique tree forests (SLCTF) with bounded clique sizes, and then uses a heuristic belief update algorithm to infer the marginals. For the special case of Bayesian networks, we show that if the incremental build step in IBIA uses the topological order of variables, then (a) the prior marginals are consistent in all CTFs in the SLCTF and (b) the posterior marginals are consistent once all evidence variables are added to the SLCTF. In our approach, the belief propagation step is non-iterative and the accuracy-complexity trade-off is controlled using user-defined clique size bounds. Results for several benchmark sets from recent UAI competitions show that our method gives accuracy better than or comparable to that of existing variational and sampling-based methods, with smaller runtimes.

artificial intelligence, ctf, machine learning, (15 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Technology: