Score Operator Newton transport

Chandramoorthy, Nisha, Schaefer, Florian, Marzouk, Youssef

arXiv.org Artificial Intelligence 

Generating samples from a complex (e.g., non-Gaussian, high-dimensional) probability distribution is a core computational challenge in diverse applications, ranging from computational statistics and machine learning to molecular simulation. A recurring setting is one where the density ρ of the target distribution is specified only up to a normalizing constant, for example in Bayesian modeling, where ρ represents the posterior density. Here, evaluations of the score ∇ log ρ, which does not depend on the normalizing constant, are often available as well, even for complex statistical models [Villa et al., 2021]. Alternatively, many new methods enable effective score estimation from data, without explicit density estimation; examples include score estimation from time series observations in chaotic dynamical systems [Chandramoorthy and Wang, 2022, Ni, 2020] and score-based modeling of image distributions [Song et al., 2020b,a]. In these settings, transport or "flow"-driven algorithms for generating samples have seen extensive success. The central idea is to construct a transport map from a simple, prescribed source distribution to the target distribution of interest. One class of transport approaches, represented for example by variational inference with normalizing flows, constructs a parametric class of invertible maps and minimizes some statistical divergence between the target and the pushforward (see Section 2) of the source by a member of this class. A different, essentially nonparametric, class of transport approaches is based on particle systems, e.g., Stein variational gradient descent (SVGD)