statistical manifold
Categorical Flow Matching on Statistical Manifolds
We introduce Statistical Flow Matching (SFM), a novel and mathematically rigorous flow-matching framework on the manifold of parameterized probability measures inspired by the results from information geometry. We demonstrate the effectiveness of our method on the discrete generation problem by instantiat-ing SFM on the manifold of categorical distributions whose geometric properties remain unexplored in previous discrete generative models. Utilizing the Fisher information metric, we equip the manifold with a Riemannian structure whose intrinsic geometries are effectively leveraged by following the shortest paths of geodesics. We develop an efficient training and sampling algorithm that overcomes numerical stability issues with a diffeomorphism between manifolds. Our distinctive geometric perspective of statistical manifolds allows us to apply optimal transport during training and interpret SFM as following the steepest direction of the natural gradient. Unlike previous models that rely on variational bounds for likelihood estimation, SFM enjoys the exact likelihood calculation for arbitrary probability measures. We manifest that SFM can learn more complex patterns on the statistical manifold where existing models often fail due to strong prior assumptions. Comprehensive experiments on real-world generative tasks ranging from image, text to biological domains further demonstrate that SFM achieves higher sampling quality and likelihood than other discrete diffusion or flow-based models. Our code is available at https://github.com/ccr-cheng/
- North America > United States > Illinois > Champaign County > Urbana (0.04)
- North America > United States > New York (0.04)
- North America > Canada > Quebec (0.04)
- Asia > Middle East > Jordan (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Vision (0.93)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Spectral Concentration at the Edge of Stability: Information Geometry of Kernel Associative Memory
Recent advances using Kernel Logistic Regression (KLR) have demonstrated that learning can sculpt these landscapes to achieve capacities far exceeding classical limits [1-3]. Our previous phenomenological analysis identified a Ridge of Optimization where stability is maximized via a mechanism we termed Spectral Concentration, defined as a state where the weight spectrum exhibits a sharp hierarchy [4]. However, a deeper question remains: Why does the learning dynamics self-organize into this specific spectral state? Why does the system operate at the brink of instability? T o answer these questions, we must look beyond the Euclidean geometry of the weight parameters and consider the intrinsic geometry of the probability distributions they represent. This is the domain of Information Geometry [5]. In this work, we reinterpret the KLR Hopfield network as a statistical manifold equipped with a Fisher-Rao metric.
Learning under Distributional Drift: Reproducibility as an Intrinsic Statistical Resource
Statistical learning under distributional drift remains insufficiently characterized: when each observation alters the data-generating law, classical generalization bounds can collapse. We introduce a new statistical primitive, the reproducibility budget $C_T$, which quantifies a system's finite capacity for statistical reproducibility - the extent to which its sampling process can remain governed by a consistent underlying distribution in the presence of both exogenous change and endogenous feedback. Formally, $C_T$ is defined as the cumulative Fisher-Rao path length of the coupled learner-environment evolution, measuring the total distributional motion accumulated during learning. From this construct we derive a drift-feedback generalization bound of order $O(T^{-1/2} + C_T/T)$, and we prove a matching minimax lower bound showing that this rate is minimax-optimal. Consequently, the results establish a reproducibility speed limit: no algorithm can achieve smaller worst-case generalization error than that imposed by the average Fisher-Rao drift rate $C_T/T$ of the data-generating process. The framework situates exogenous drift, adaptive data analysis, and performative prediction within a common geometric structure, with $C_T$ emerging as the intrinsic quantity measuring distributional motion across these settings.
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- North America > United States > Rhode Island > Providence County > Providence (0.04)
- (7 more...)
Self-Improving AI Agents through Self-Play
We extend the moduli-theoretic framework of psychometric batteries to the domain of dynamical systems. While previous work established the AAI capability score as a static functional on the space of agent representations, this paper formalizes the agent as a flow $ν_r$ parameterized by computational resource $r$, governed by a recursive Generator-Verifier-Updater (GVU) operator. We prove that this operator generates a vector field on the parameter manifold $Θ$, and we identify the coefficient of self-improvement $κ$ as the Lie derivative of the capability functional along this flow. The central contribution of this work is the derivation of the Variance Inequality, a spectral condition that is sufficient (under mild regularity) for the stability of self-improvement. We show that a sufficient condition for $κ> 0$ is that, up to curvature and step-size effects, the combined noise of generation and verification must be small enough. We then apply this formalism to unify the recent literature on Language Self-Play (LSP), Self-Correction, and Synthetic Data bootstrapping. We demonstrate that architectures such as STaR, SPIN, Reflexion, GANs and AlphaZero are specific topological realizations of the GVU operator that satisfy the Variance Inequality through filtration, adversarial discrimination, or grounding in formal systems.
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
- Information Technology > Artificial Intelligence > Cognitive Science (0.93)
- North America > United States > Illinois > Champaign County > Urbana (0.04)
- North America > United States > New York (0.04)
- North America > Canada > Quebec (0.04)
- Asia > Middle East > Jordan (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Vision (0.93)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
- North America > United States > New York > New York County > New York City (0.14)
- Asia > Japan > Honshū > Chūbu > Nagano Prefecture > Nagano (0.05)
- North America > United States > California (0.04)
Enforcing Latent Euclidean Geometry in Single-Cell VAEs for Manifold Interpolation
Palma, Alessandro, Rybakov, Sergei, Hetzel, Leon, Günnemann, Stephan, Theis, Fabian J.
Latent space interpolations are a powerful tool for navigating deep generative models in applied settings. An example is single-cell RNA sequencing, where existing methods model cellular state transitions as latent space interpolations with variational autoencoders, often assuming linear shifts and Euclidean geometry. However, unless explicitly enforced, linear interpolations in the latent space may not correspond to geodesic paths on the data manifold, limiting methods that assume Euclidean geometry in the data representations. We introduce FlatVI, a novel training framework that regularises the latent manifold of discrete-likelihood variational autoencoders towards Euclidean geometry, specifically tailored for modelling single-cell count data. By encouraging straight lines in the latent space to approximate geodesic interpolations on the decoded single-cell manifold, FlatVI enhances compatibility with downstream approaches that assume Euclidean latent geometry. Experiments on synthetic data support the theoretical soundness of our approach, while applications to time-resolved single-cell RNA sequencing data demonstrate improved trajectory reconstruction and manifold interpolation.
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
- Asia > Middle East > Jordan (0.04)
Geometric Gaussian Approximations of Probability Distributions
Da Costa, Nathaël, Mucsányi, Bálint, Hennig, Philipp
Approximating complex probability distributions, such as Bayesian posterior distributions, is of central interest in many applications. We study the expressivity of geometric Gaussian approximations. These consist of approximations by Gaussian pushforwards through diffeomorphisms or Riemannian exponential maps. We first review these two different kinds of geometric Gaussian approximations. Then we explore their relationship to one another. We further provide a constructive proof that such geometric Gaussian approximations are universal, in that they can capture any probability distribution. Finally, we discuss whether, given a family of probability distributions, a common diffeomorphism can be found to obtain uniformly high-quality geometric Gaussian approximations for that family.
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Categorical Flow Matching on Statistical Manifolds
We introduce Statistical Flow Matching (SFM), a novel and mathematically rigorous flow-matching framework on the manifold of parameterized probability measures inspired by the results from information geometry. We demonstrate the effectiveness of our method on the discrete generation problem by instantiating SFM on the manifold of categorical distributions whose geometric properties remain unexplored in previous discrete generative models. Utilizing the Fisher information metric, we equip the manifold with a Riemannian structure whose intrinsic geometries are effectively leveraged by following the shortest paths of geodesics. We develop an efficient training and sampling algorithm that overcomes numerical stability issues with a diffeomorphism between manifolds. Our distinctive geometric perspective of statistical manifolds allows us to apply optimal transport during training and interpret SFM as following the steepest direction of the natural gradient.
Generative Distribution Embeddings
Fishman, Nic, Gowri, Gokul, Yin, Peng, Gootenberg, Jonathan, Abudayyeh, Omar
Many real-world problems require reasoning across multiple scales, demanding models which operate not on single data points, but on entire distributions. We introduce generative distribution embeddings (GDE), a framework that lifts autoencoders to the space of distributions. In GDEs, an encoder acts on sets of samples, and the decoder is replaced by a generator which aims to match the input distribution. This framework enables learning representations of distributions by coupling conditional generative models with encoder networks which satisfy a criterion we call distributional invariance. We show that GDEs learn predictive sufficient statistics embedded in the Wasserstein space, such that latent GDE distances approximately recover the $W_2$ distance, and latent interpolation approximately recovers optimal transport trajectories for Gaussian and Gaussian mixture distributions. We systematically benchmark GDEs against existing approaches on synthetic datasets, demonstrating consistently stronger performance. We then apply GDEs to six key problems in computational biology: learning representations of cell populations from lineage-tracing data (150K cells), predicting perturbation effects on single-cell transcriptomes (1M cells), predicting perturbation effects on cellular phenotypes (20M single-cell images), modeling tissue-specific DNA methylation patterns (253M sequences), designing synthetic yeast promoters (34M sequences), and spatiotemporal modeling of viral protein sequences (1M sequences).
- North America > United States (0.28)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East > Jordan (0.04)
- Asia > Middle East > Israel (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Data Science (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)