A Unified Framework for Data-Free One-Step Sampling via Wasserstein Gradient Flows

arXiv.org Machine Learning

We develop a unified theoretical framework for data-free one-step sampling from unnormalized target distributions based on Wasserstein gradient flows. For a broad class of standard f-divergence objectives, we show that the induced velocity field admits the universal form $\mathbf{V}(x)=w(r(x))\,β(x)$, where $β(x)=\nabla \log (p(x)/q(x))$ is shared across objectives and $w$ is determined solely by the choice of divergence. This decomposition shows that standard f-divergence drifts share the same asymptotic target distribution $p$ and differ primarily in how they redistribute transient repair effort across under-covered regions. To formalize this distinction, we derive a one-step regional-response theory for a soft under-coverage functional and obtain a compression--elasticity identity that links divergence choice to the geometry of mass transport into under-covered regions. We further extend the framework beyond the f-divergence family to the Log-Variance (LV) divergence, analyze how the reference distribution alters the resulting drift structure, and motivate a practical LV-inspired surrogate for data-free training. Based on this theory, we instantiate the framework with a KDE-based implementation and describe a complementary normalizing-flow route, enabling one-step inference after training. Experiments on multimodal Gaussian-mixture benchmarks are consistent with the theoretical predictions and demonstrate effective one-step sampling on these targets.
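The decomposition $\mathbf{V}(x)=w(r(x))\,β(x)$ can be illustrated with a minimal one-dimensional sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: both $p$ and $q$ are Gaussians so that the scores are available in closed form, $r(x)$ is taken to be the density ratio $p(x)/q(x)$, and the helper names `gaussian_score`, `gaussian_pdf`, and `velocity` are hypothetical. The default weight $w \equiv 1$ simply returns the shared drift $β(x)$.

```python
import math

def gaussian_score(x, mean, std):
    """Score d/dx log N(x; mean, std^2) of a 1D Gaussian."""
    return -(x - mean) / std ** 2

def gaussian_pdf(x, mean, std):
    """Density of a 1D Gaussian N(mean, std^2) at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))

def velocity(x, p, q, w=lambda r: 1.0):
    """V(x) = w(r(x)) * beta(x), with beta(x) = d/dx log(p(x) / q(x)).

    p and q are (mean, std) pairs. Assumption: r(x) = p(x) / q(x).
    """
    beta = gaussian_score(x, *p) - gaussian_score(x, *q)
    r = gaussian_pdf(x, *p) / gaussian_pdf(x, *q)
    return w(r) * beta

# Target p = N(1, 1), current sampler distribution q = N(0, 1).
v_plain = velocity(0.0, p=(1.0, 1.0), q=(0.0, 1.0))                      # w = 1
v_scaled = velocity(0.0, p=(1.0, 1.0), q=(0.0, 1.0), w=lambda r: r)     # w(r) = r
```

Changing `w` rescales the drift pointwise but leaves its direction, given by the sign of `beta`, unchanged wherever `w` is positive.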


Entropy-based Training Methods for Scalable Neural Implicit Sampler

Neural Information Processing Systems

Efficiently sampling from un-normalized target distributions is a fundamental problem in scientific computing and machine learning. Traditional approaches such as Markov Chain Monte Carlo (MCMC) guarantee asymptotically unbiased samples from such distributions but suffer from computational inefficiency, particularly when dealing with high-dimensional targets, as they require numerous iterations to generate a batch of samples. In this paper, we introduce an efficient and scalable neural implicit sampler that overcomes these limitations. The implicit sampler can generate large batches of samples with low computational costs by leveraging a neural transformation that directly maps easily sampled latent vectors to target samples without the need for iterative procedures. To train the neural implicit samplers, we introduce two novel methods: the KL training method and the Fisher training method.
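The one-pass sampling idea described above, in which latent vectors are pushed through a transformation with no iterative procedure, can be sketched as follows. This is only an illustration under stated assumptions: the single tanh layer, the fixed untrained weights, and the names `implicit_sampler` and `sample_batch` are hypothetical and do not reflect the paper's architecture or its KL and Fisher training methods.

```python
import math
import random

def implicit_sampler(z, weights, bias):
    """Map one latent vector z to one sample with a single tanh layer.

    The layer and its parameters are placeholders (assumption), not a trained model.
    """
    return [
        math.tanh(sum(w_ij * z_j for w_ij, z_j in zip(row, z)) + b)
        for row, b in zip(weights, bias)
    ]

def sample_batch(n, latent_dim, weights, bias, seed=0):
    """Draw n samples: one standard-normal latent draw and one forward pass each."""
    rng = random.Random(seed)
    batch = []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
        batch.append(implicit_sampler(z, weights, bias))
    return batch

# Hypothetical parameters: 2D latent space mapped to 3D samples.
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
bias = [0.0, 0.0, 0.0]
batch = sample_batch(5, 2, weights, bias)
```

The cost per sample is one forward pass, in contrast to the many iterations an MCMC chain needs per batch.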


Lifting Weak Supervision To Structured Prediction

Neural Information Processing Systems

For labels taking values in a finite metric space, we introduce techniques new to weak supervision based on pseudo-Euclidean embeddings and tensor decompositions, providing a nearly-consistent noise rate estimator.


Invariant Representations without Adversarial Training

Neural Information Processing Systems

We show that adversarial training is unnecessary and sometimes counter-productive; we instead cast invariant representation learning as a single information-theoretic objective that can be directly optimized.



Correlated Uncertainty for Learning Dense Correspondences from Noisy Labels

Neural Information Processing Systems

An alternative approach is to predict instead a distribution $p(\hat{y} \mid x) = \Phi_{\hat{y}}(x)$ over possible values of the annotation $y$. The annotators are shown a set of points sampled randomly and uniformly over one of the predefined body parts of a person in an image.


Symmetry-induced Disentanglement on Graphs

Neural Information Processing Systems

Disentanglement has been formalized using a symmetry-centric notion for unstructured spaces; however, graphs have eluded a similarly rigorous treatment. We fill this gap with a new notion of conditional symmetry for disentanglement, and leverage tools from Lie algebras to encode graph properties into subgroups using suitable adaptations of generative models such as Variational Autoencoders.


Appendix: Variational Continual Bayesian Meta-Learning

Neural Information Processing Systems

In variational continual learning, the posterior distribution of interest is frequently intractable and approximation is required. We summarize the meta-training process of our VC-BML in Algorithm 1. Moreover, we evaluate FTML on the unseen tasks (i.e., tasks sampled from meta-test set) instead of the training tasks that the original FTML used. It would be unfair to adopt the original initialization procedure in OSML. BOMVI [10]: In our experiments, we use variational inference to approximate the posterior of meta-parameters. E.3.2 Settings. As the latent variables in this paper are meta-parameters and task-specific parameters, the dimensionality of the latent space is actually determined by the number of parameters in the deep neural network. In particular, we define a CNN architecture and present its details in Table 1.


Supplementary Material For Stochastic Multiple Target Sampling Gradient Descent

Neural Information Processing Systems

By contrast, there is only one quadratic programming problem to solve in our proposed method, which significantly reduces time complexity, especially when the number of particles is high. The mean square error for each task and the average results are shown in Table 1. MT-SGD outperforms the second-best method, MOO-SVGD, with 0.2251 vs. However, on the one hand, computing U's entries can be accelerated in practice by calculating them in parallel since there is no interaction between them during the forward pass. All images are resized to 64 × 64 × 3. Due to space constraints, we report only the abbreviation of each task in the main paper; their full names are presented below.


Markovian Score Climbing

Neural Information Processing Systems

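Output: $\lambda_K \approx \lambda^\star$, $\theta_K \approx \theta^\star$.
for $k = 1, \ldots, K$ do
    Sample $z[k] \sim M(\cdot \mid z[k-1]; \theta_{k-1}, \lambda_{k-1})$
    Compute $s(z[k]; \lambda_{k-1}) = \nabla_\lambda \log q(z[k]; \lambda_{k-1})$
    Compute $\hat{g}_{\mathrm{ML}}(\theta_{k-1}) = \nabla_\theta \log p(z[k], x; \theta_{k-1})$
    Set $\lambda_k = \lambda_{k-1} + \varepsilon_k \, s(z[k]; \lambda_{k-1})$
    Set $\theta_k = \theta_{k-1} + \gamma_k \, \hat{g}_{\mathrm{ML}}(\theta_{k-1})$
end

[…] We compare MSC with SMC-based [22] using [29].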
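The variational half of the loop above (sample the next state from a Markov kernel, compute the score of $\log q$, take a gradient step) can be sketched on a toy problem. The setup is entirely an assumption for illustration: the target is a fixed 1D Gaussian with no model parameters to learn, $q$ is a unit-variance Gaussian whose mean `lam` is the only variational parameter, the step size is $1/k$, and an independence Metropolis-Hastings kernel with proposal $q$ is swapped in because the excerpt does not specify the kernel $M$. The name `msc_toy` is hypothetical.

```python
import math
import random

def msc_toy(num_steps=5000, target_mean=2.0, seed=0):
    """Toy score-climbing loop: fit the mean of q(z; lam) = N(lam, 1) to p = N(target_mean, 1)."""
    rng = random.Random(seed)
    lam = 0.0  # variational parameter (mean of q)
    z = 0.0    # initial chain state z[0]

    def log_weight(point, lam_value):
        # log p(point) - log q(point; lam_value); normalizing constants cancel.
        return -0.5 * (point - target_mean) ** 2 + 0.5 * (point - lam_value) ** 2

    for k in range(1, num_steps + 1):
        # Markov kernel step: independence Metropolis-Hastings with proposal q(.; lam),
        # which leaves p invariant (assumed stand-in for the kernel M).
        proposal = rng.gauss(lam, 1.0)
        log_ratio = log_weight(proposal, lam) - log_weight(z, lam)
        if rng.random() < math.exp(min(0.0, log_ratio)):
            z = proposal
        # Score of log q with respect to lam for a unit-variance Gaussian.
        score = z - lam
        # Stochastic gradient step with decreasing step size 1/k.
        lam += score / k
    return lam

fitted_mean = msc_toy()
```

With this particular step size the update reduces to a running average of the chain states, so `fitted_mean` should land near the target mean of 2.0; other decreasing step-size schedules would behave differently.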