sample space


Score-based Generative Modeling through Stochastic Evolution Equations in Hilbert Spaces

Neural Information Processing Systems

Continuous-time score-based generative models consist of a pair of stochastic differential equations (SDEs): a forward SDE that smoothly transitions data into a noise space, and a reverse SDE that incrementally removes noise from a Gaussian prior to generate samples from the data distribution. The two are intrinsically connected by the time-reversal theory of diffusion processes. In this paper, we investigate stochastic evolution equations in Hilbert spaces, which extend the applicability of SDEs in two respects, the sample space and the evolution operator, and thereby encompass recent variants of diffusion models, such as generating functional data or replacing the drift coefficient with an image transformation. To this end, we derive a generalized time-reversal formula that builds a bridge between probabilistic diffusion models and stochastic evolution equations, and we propose a score-based generative model called the Hilbert Diffusion Model (HDM). Combined with a Fourier neural operator, we verify the superiority of HDM for sampling functions from functional datasets, with a power of the kernel two-sample test of 4.2 on Quadratic, 0.2 on Melbourne, and 3.6 on Gridwatch, outperforming existing diffusion models formulated in function spaces. Furthermore, the proposed method shows its strength in motion-synthesis tasks by utilizing a Wiener process with values in a Hilbert space.
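The forward/reverse SDE pair described above can be illustrated in a toy setting. The sketch below is not the paper's Hilbert-space HDM; it uses a 1-D Ornstein-Uhlenbeck forward SDE whose marginals stay Gaussian, so the exact score is available in closed form and the reverse SDE can be simulated with Euler-Maruyama. The function names are hypothetical.

```python
import numpy as np

# Illustrative 1-D sketch (not the paper's Hilbert-space HDM):
# forward SDE  dx = -x dt + sqrt(2) dW  pushes data toward N(0, 1);
# the reverse SDE removes the noise again using the score of the
# forward marginals. For data ~ N(m0, s0^2) those marginals stay
# Gaussian, so the exact score is available and the result checkable.

def forward_marginal(m0, s0, t):
    """Mean and std of the Ornstein-Uhlenbeck marginal at time t."""
    m = m0 * np.exp(-t)
    s = np.sqrt(s0 ** 2 * np.exp(-2 * t) + 1.0 - np.exp(-2 * t))
    return m, s

def reverse_sample(m0, s0, T=3.0, n_steps=300, n=20000, seed=0):
    """Euler-Maruyama on the reverse SDE, started from the N(0, 1) prior."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = rng.standard_normal(n)              # approximate prior at time T
    for i in range(n_steps):
        t = T - i * dt
        m, s = forward_marginal(m0, s0, t)
        score = -(x - m) / s ** 2           # exact Gaussian score at time t
        drift = -x - 2.0 * score            # f(x) - g^2 * score, with g = sqrt(2)
        x = x - drift * dt + np.sqrt(2.0 * dt) * rng.standard_normal(n)
    return x

samples = reverse_sample(m0=2.0, s0=0.5)    # should land near N(2.0, 0.25)
```

With the exact score, the empirical mean and standard deviation of the reverse samples recover those of the data distribution up to discretization and Monte Carlo error.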


Joint Bayesian Inference of Graphical Structure and Parameters with a Single Generative Flow Network

Neural Information Processing Systems

Generative Flow Networks (GFlowNets), a class of generative models over discrete and structured sample spaces, have been previously applied to the problem of inferring the marginal posterior distribution over the directed acyclic graph (DAG) of a Bayesian Network, given a dataset of observations. Based on recent advances extending this framework to non-discrete sample spaces, we propose in this paper to approximate the joint posterior over not only the structure of a Bayesian Network, but also the parameters of its conditional probability distributions. We use a single GFlowNet whose sampling policy follows a two-phase process: the DAG is first generated sequentially one edge at a time, and then the corresponding parameters are picked once the full structure is known. Since the parameters are included in the posterior distribution, this leaves more flexibility for the local probability models of the Bayesian Network, making our approach applicable even to non-linear models parametrized by neural networks. We show that our method, called JSP-GFN, offers an accurate approximation of the joint posterior, while comparing favorably against existing methods on both simulated and real data.
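The two-phase sampling policy can be illustrated with a toy sketch. In the code below, random choices stand in for the learned GFlowNet policies and all names are illustrative: phase one grows a DAG one acyclicity-preserving edge at a time, and phase two samples parameters for each node's conditional distribution (here linear-Gaussian) once the structure is fixed.

```python
import numpy as np
from itertools import product

# Toy sketch of the two-phase policy (random choices replace the
# learned GFlowNet policies; all names are illustrative).

def creates_cycle(adj, i, j):
    """True if adding edge i -> j would close a directed cycle."""
    n = len(adj)
    stack, seen = [j], set()
    while stack:                     # cycle iff j already reaches i
        u = stack.pop()
        if u == i:
            return True
        if u not in seen:
            seen.add(u)
            stack.extend(v for v in range(n) if adj[u][v])
    return False

def sample_dag_and_params(n_nodes=4, stop_prob=0.3, seed=0):
    rng = np.random.default_rng(seed)
    adj = [[0] * n_nodes for _ in range(n_nodes)]
    while True:                      # phase 1: sequential edge additions
        valid = [(i, j) for i, j in product(range(n_nodes), repeat=2)
                 if i != j and not adj[i][j] and not creates_cycle(adj, i, j)]
        if not valid or rng.random() < stop_prob:
            break
        i, j = valid[rng.integers(len(valid))]
        adj[i][j] = 1
    # phase 2: one weight per parent plus a bias term for every node
    params = {j: rng.standard_normal(sum(adj[i][j] for i in range(n_nodes)) + 1)
              for j in range(n_nodes)}
    return adj, params

adj, params = sample_dag_and_params()
```

Because every proposed edge is checked for acyclicity, the sampled adjacency matrix is always nilpotent, i.e. a valid DAG.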


DECOrrelated feature space partitioning for distributed sparse regression

Neural Information Processing Systems

Fitting statistical models is computationally challenging when the sample size or the dimension of the dataset is huge. An attractive approach for reducing the problem size is to first partition the dataset into subsets and then fit the model using distributed algorithms. The dataset can be partitioned either horizontally (in the sample space) or vertically (in the feature space). While the majority of the literature focuses on sample-space partitioning, feature-space partitioning is more effective when p >> n. Existing methods for partitioning features, however, are either vulnerable to high correlations or inefficient in reducing the model dimension.
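A minimal sketch of vertical (feature-space) partitioning, assuming a plain coordinate-descent lasso as the per-block estimator: each block of columns is fitted independently and the coefficients are concatenated. This is the naive scheme that degrades under correlated features; the decorrelation step that DECO-style methods add is not shown, and all function names are illustrative.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator used by coordinate-descent lasso."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Plain coordinate-descent lasso; columns of X assumed standardized."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    r = y - X @ beta
    for _ in range(n_iter):
        for j in range(p):
            r = r + X[:, j] * beta[j]               # add column j back
            beta[j] = soft_threshold(X[:, j] @ r, lam * n) / col_sq[j]
            r = r - X[:, j] * beta[j]               # remove updated column j
    return beta

def partitioned_lasso(X, y, n_blocks, lam):
    """Vertical partitioning: fit each feature block separately, concatenate."""
    blocks = np.array_split(np.arange(X.shape[1]), n_blocks)
    return np.concatenate([lasso_cd(X[:, b], y, lam) for b in blocks])

rng = np.random.default_rng(0)
n, p = 200, 40
X = rng.standard_normal((n, p))     # independent features: the easy case
beta_true = np.zeros(p)
beta_true[[3, 17, 31]] = 2.0
y = X @ beta_true + 0.1 * rng.standard_normal(n)
beta_hat = partitioned_lasso(X, y, n_blocks=4, lam=0.1)
```

With independent features the per-block fits recover the true support well; with highly correlated features across blocks, the signal leaks into every block's residual, which is exactly the failure mode the passage attributes to naive feature partitioning.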




... sample space), when extending them to ...

Neural Information Processing Systems

We thank all reviewers for their detailed, constructive feedback and suggestions. Table B (below) demonstrates this empirically. ... Gumbel-Softmax has ... with significantly less training time and resource consumption. These experiments show that, when trained with Gumbel-CRF, the AR decoder outperforms REINFORCE. We will clarify this in the paper.
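As background on the estimators compared in this rebuttal, the Gumbel-Softmax relaxation can be sketched as follows (an illustrative sketch, not the paper's Gumbel-CRF estimator): adding Gumbel(0, 1) noise to the logits and applying a temperature-tau softmax gives a differentiable relaxation of categorical sampling, and its argmax reproduces an exact categorical sample (the Gumbel-max trick).

```python
import numpy as np

# Illustrative sketch of the Gumbel-Softmax relaxation (not Gumbel-CRF).
# softmax((logits + Gumbel noise) / tau) relaxes a categorical sample;
# its argmax is an exact categorical sample (the Gumbel-max trick).

def gumbel_softmax(logits, tau, rng):
    g = -np.log(-np.log(rng.uniform(size=logits.shape)))  # Gumbel(0, 1)
    z = (logits + g) / tau
    z = z - z.max()                  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(0)
logits = np.array([1.0, 2.0, 0.5])
counts = np.zeros(3)
for _ in range(5000):
    counts[np.argmax(gumbel_softmax(logits, tau=0.1, rng=rng))] += 1
probs = counts / counts.sum()        # should approximate softmax(logits)
```

Lowering tau sharpens the relaxed sample toward one-hot, at the cost of higher-variance gradients; that bias-variance trade-off is what structured relaxations such as Gumbel-CRF aim to improve.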



... leads to ...

Neural Information Processing Systems

We thank all the reviewers for the valuable comments.

Advantages of CSGLD over M-SGD: (i) CSGLD belongs to the class of adaptive biasing force algorithms and ... Empirically, we suggest partitioning the sample space into a moderate number of subregions, e.g. ...

Drawbacks of simulated annealing (SA) and replica exchange SGLD (reSGLD)/parallel tempering: SA can only be ...

Q2. Missing baselines: We further compared CSGLD with CyclicalSGLD and reSGLD on an asymmetric mixture ... We will include the baselines and references in the next version.

The gradient-vanishing problem in SGLD is not clear: Please refer to our reply to Q1 of Reviewer 1.

Q1. Comments on bizarre peaks: A bizarre peak always indicates that there is a local minimum of the same energy in ...
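As background for the discussion above, the sketch below shows unadjusted Langevin dynamics (the base sampler underlying SGLD) together with the kind of energy-based partition of the sample space into subregions that CSGLD's adaptive biasing is defined over. The biasing weights that distinguish CSGLD are not implemented, and all names are illustrative.

```python
import numpy as np

# Background sketch: unadjusted Langevin dynamics targeting N(0, 1),
# plus an energy-based partition of the sample space into subregions.
# CSGLD's adaptive biasing weights are NOT implemented here.

def grad_log_p(x):
    return -x                        # score of the N(0, 1) target

def langevin(n_steps=50000, step=0.05, seed=0):
    rng = np.random.default_rng(seed)
    x, xs = 0.0, []
    for _ in range(n_steps):
        x = (x + 0.5 * step * grad_log_p(x)
               + np.sqrt(step) * rng.standard_normal())
        xs.append(x)
    return np.array(xs)

def subregion(x, u_min=0.0, du=0.5):
    """Index of the energy subregion containing x, with U(x) = x^2 / 2."""
    return int((0.5 * x * x - u_min) // du)

xs = langevin()                      # samples roughly distributed as N(0, 1)
```

In CSGLD, the sampler additionally estimates the probability mass of each energy subregion on the fly and rescales the drift accordingly, which is what helps it escape deep local minima that trap plain SGLD.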