AITopics | theorem 8

The natural gradient method is widely used in statistical optimization, but its standard formulation assumes a Euclidean parameter space. This paper proposes an inversion-free stochastic natural gradient method for probability distributions whose parameters lie on a Riemannian manifold. The manifold setting offers several advantages: one can implicitly enforce parameter constraints such as positive definiteness and orthogonality, ensure parameters are identifiable, or guarantee regularity properties of the objective like geodesic convexity. Building on an intrinsic formulation of the Fisher information matrix (FIM) on a manifold, our method maintains an online approximation of the inverse FIM, which is efficiently updated at quadratic cost using score vectors sampled at successive iterates. In the Riemannian setting, these score vectors belong to different tangent spaces and must be combined using transport operations. We prove almost-sure convergence rates of $O(\log{s}/s^α)$ for the squared distance to the minimizer when the step size exponent $α>2/3$. We also establish almost-sure rates for the approximate FIM, which now accumulates transport-based errors. A limited-memory variant of the algorithm with sub-quadratic storage complexity is proposed. Finally, we demonstrate the effectiveness of our method relative to its Euclidean counterparts on variational Bayes with Gaussian approximations and normalizing flows.

artificial intelligence, machine learning, manifold, (18 more...)

arXiv.org Machine Learning

2604.02969

Country:

Europe > Belarus > Minsk Region > Minsk (0.04)
Asia > Middle East > Jordan (0.04)
South America > Argentina (0.04)
(4 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.65)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Add feedback

On the Reliability Limits of LLM-Based Multi-Agent Planning

Ao, Ruicheng, Gao, Siyang, Simchi-Levi, David

arXiv.org Machine LearningMar-31-2026

This technical note studies the reliability limits of LLM-based multi-agent planning as a delegated decision problem. We model the LLM-based multi-agent architecture as a finite acyclic decision network in which multiple stages process shared model-context information, communicate through language interfaces with limited capacity, and may invoke human review. We show that, without new exogenous signals, any delegated network is decision-theoretically dominated by a centralized Bayes decision maker with access to the same information. In the common-evidence regime, this implies that optimizing over multi-agent directed acyclic graphs under a finite communication budget can be recast as choosing a budget-constrained stochastic experiment on the shared signal. We also characterize the loss induced by communication and information compression. Under proper scoring rules, the gap between the centralized Bayes value and the value after communication admits an expected posterior divergence representation, which reduces to conditional mutual information under logarithmic loss and to expected squared posterior error under the Brier score. These results characterize the fundamental reliability limits of delegated LLM planning. Experiments with LLMs on a controlled problem set further demonstrate these characterizations.

artificial intelligence, communication, information, (16 more...)

arXiv.org Machine Learning

2603.26993

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Kosovo > District of Gjilan > Kamenica (0.04)
Asia > China > Hong Kong > Kowloon (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Add feedback

8249b30d877c91611fd8c7aa6ac2b5fe-Supplemental-Conference.pdf

Neural Information Processing SystemsFeb-15-2026, 14:01:02 GMT

abadi, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

Proof of Proposition 2. If s = 0, the result is trivial. Hence, using the alternative formulationµs(X) = lnkeXks, we get that s 7 µs(X) is nondecreasing, andlims + µs(X) = ln(esssup(eX)) = esssupX. By definition of IX, φs0(X) and φs1(X) are integrable. The result follows from standard analysis of non-convex gradient descent. Hence, f(x) is inferior to the sum of both terms.

artificial intelligence, hxt, supplementarymaterial, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.35)

Add feedback