Mean-Field Path-Integral Diffusion: From Samples to Interacting Agents
Independent sample generation is the prevailing paradigm in modern diffusion-based generative models of AI. We ask a different question: can samples coordinate through shared population statistics to transport probability mass more efficiently? We introduce Mean-Field Path-Integral Diffusion (MF-PID), a framework in which samples are promoted to interacting agents whose drift depends self-consistently on the evolving population density. We identify two analytically tractable regimes: a Linear-Quadratic-Gaussian (LQG) benchmark in which the infinite-dimensional mean-field system reduces to a finite set of Riccati and linear ODEs, and a Gaussian-mixture regime governed by a piecewise-constant protocol that preserves closed-form solvability. For a quadratic interaction potential with schedule β_t and zero base drift we prove that the self-consistent MF guidance is the exact linear interpolant between initial and target global means -- a result that holds for arbitrary initial and target densities and any β_t. Applied to demand-response control of energy systems, where agents aggregated into an ensemble are energy consumers (e.g. The energy saving is independent of the number of zones per building (d = 1-32 tested), confirming that the linear guidance formula broadcasts a single d-vector with O(d) communication and grows mildly in compute (sub-cubically for d ≤ 32, asymptotically O(d^3) for d ≫ 1).
Introduction
Generative AI has been transformed by diffusion models, which frame sample generation as a stochastic process steered from noise to data [1-3]. A key structural feature of these models -- shared with other generative models, e.g.
Similarly, stochastic optimal transport (SOT) and Schrödinger bridge formulations [6-8] cast distribution matching as an independent-particle path optimization, yielding tractable convolutions of Green functions but discarding inter-particle information; stochastic interpolants [9] construct flexible transport bridges between arbitrary densities via tunable continuous-time stochastic processes, recovering the Schrödinger bridge as a special limit -- again in an independent-particle framework.
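The linear-interpolant guidance stated in the MF-PID abstract can be illustrated in a few lines. The following is only a minimal sketch under stated assumptions: time is normalized to [0, 1], the interpolation is taken to be linear in t itself, and the function name `linear_mean_guidance` and the example means are hypothetical rather than taken from the paper.

```python
import numpy as np

def linear_mean_guidance(m0, m1, t):
    """Return the d-vector (1 - t) * m0 + t * m1 for t in [0, 1].

    m0, m1: initial and target global means (length-d arrays).
    Assumption: time is normalized so the transport runs from t=0 to t=1.
    """
    m0 = np.asarray(m0, dtype=float)
    m1 = np.asarray(m1, dtype=float)
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    return (1.0 - t) * m0 + t * m1

# Hypothetical example with d = 3 zones.
m_init = np.array([0.0, 2.0, -1.0])
m_target = np.array([1.0, 0.0, 3.0])
guidance_mid = linear_mean_guidance(m_init, m_target, 0.5)
```

Each evaluation produces a single d-vector, which is consistent with the O(d) communication cost mentioned in the abstract: every agent would receive the same broadcast vector.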
Here, we focus on a more practical setting in object rearrangement, i.e., rearranging objects from shuffled layouts to a normative target distribution without explicit goal specification. However, it remains challenging for AI agents, as it is hard to describe the target distribution (goal specification) for reward engineering or collect expert trajectories as demonstrations. Hence, it is infeasible to directly employ reinforcement learning or imitation learning algorithms to address the task. This paper aims to search for a policy only with a set of examples from a target distribution instead of a handcrafted reward function. We employ the score-matching objective to train a Target Gradient Field (TarGF), indicating a direction on each object to increase the likelihood of the target distribution.
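To make the score-matching idea concrete, here is a minimal sketch of a denoising score-matching loss, one common form of score-matching objective. The text above does not say which variant is used, so the denoising variant, the noise level `sigma`, the Gaussian toy data, and all function names are assumptions made for illustration only.

```python
import numpy as np

def dsm_loss(score_fn, x, sigma, rng):
    """Monte Carlo estimate of the denoising score-matching loss.

    x is perturbed with Gaussian noise of scale sigma; the regression
    target is the score of the perturbation kernel, -(x_tilde - x) / sigma**2.
    """
    noise = rng.standard_normal(x.shape)
    x_tilde = x + sigma * noise
    target = -noise / sigma  # equals -(x_tilde - x) / sigma**2
    diff = score_fn(x_tilde) - target
    return 0.5 * np.mean(np.sum(diff ** 2, axis=-1))

sigma = 0.5
# Toy "target examples": standard normal samples in 2D (hypothetical data).
x = np.random.default_rng(0).standard_normal((20000, 2))

# The noised data are N(0, (1 + sigma^2) I), so their exact score is known.
true_score = lambda z: -z / (1.0 + sigma ** 2)
zero_score = lambda z: np.zeros_like(z)

# Use the same noise draw for both candidates so the comparison is fair.
loss_true = dsm_loss(true_score, x, sigma, np.random.default_rng(1))
loss_zero = dsm_loss(zero_score, x, sigma, np.random.default_rng(1))
```

In this toy setting the exact score attains a lower loss than the zero field. A learned gradient field would be obtained by minimizing the same loss over the parameters of `score_fn`.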
Generative Stochastic Optimal Transport: Guided Harmonic Path-Integral Diffusion
We introduce Guided Harmonic Path-Integral Diffusion (GH-PID), a linearly-solvable framework for guided Stochastic Optimal Transport (SOT) with a hard terminal distribution and soft, application-driven path costs. A low-dimensional guidance protocol shapes the trajectory ensemble while preserving analytic structure: the forward and backward Kolmogorov equations remain linear, the optimal score admits an explicit Green-function ratio, and Gaussian-Mixture Model (GMM) terminal laws yield closed-form expressions. This enables stable sampling and differentiable protocol learning under exact terminal matching. We develop guidance-centric diagnostics -- path cost, centerline adherence, variance flow, and drift effort -- that make GH-PID an interpretable variational ansatz for empirical SOT. Three navigation scenarios are illustrated in 2D: (i) Case A: hand-crafted protocols revealing how geometry and stiffness shape lag, curvature effects, and mode evolution; (ii) Case B: single-task protocol learning, where a piecewise-constant (PWC) centerline is optimized to minimize integrated cost; (iii) Case C: multi-expert fusion, in which a commander reconciles competing expert/teacher trajectories and terminal beliefs through an exact product-of-experts law and learns a consensus protocol. Across all settings, GH-PID generates geometry-aware, trust-aware trajectories that satisfy the prescribed terminal distribution while systematically reducing integrated cost.
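The product-of-experts fusion mentioned in Case C has a simple closed form when each expert's belief is a single Gaussian: precisions add, and the fused mean is the precision-weighted average of the expert means. The sketch below assumes single-Gaussian beliefs (the abstract's terminal laws are GMMs, which this does not cover); the function name `fuse_gaussian_experts` and the example values are hypothetical.

```python
import numpy as np

def fuse_gaussian_experts(means, covs):
    """Normalized product of Gaussian densities N(mean_k, cov_k).

    Returns the mean and covariance of the fused Gaussian:
    precision = sum of expert precisions,
    mean = fused_cov @ sum(precision_k @ mean_k).
    """
    precisions = [np.linalg.inv(np.atleast_2d(c)) for c in covs]
    fused_cov = np.linalg.inv(sum(precisions))
    weighted = sum(p @ np.atleast_1d(m) for p, m in zip(precisions, means))
    return fused_cov @ weighted, fused_cov

# Two hypothetical experts with equal confidence but different 2D targets.
fused_mean, fused_cov = fuse_gaussian_experts(
    means=[np.array([0.0, 0.0]), np.array([2.0, 2.0])],
    covs=[np.eye(2), np.eye(2)],
)
```

With equal covariances the fused belief sits midway between the two experts and is tighter than either one. An expert with a smaller covariance would pull the consensus toward its own mean, which is one way to read "trust-aware" fusion.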