Goto

Collaborating Authors

 1-wasserstein distance


Universality of Many-body Projected Ensemble for Learning Quantum Data Distribution

Tran, Quoc Hoan, Chinzei, Koki, Endo, Yasuhiro, Oshima, Hirotaka

arXiv.org Machine Learning

Recent advancements highlight the pivotal role of quantum machine learning (QML) [4, 13] in processing quantum data derived from quantum systems [14]. A fundamental task in QML is generating quantum data by learning the underlying distribution, essential for understanding quantum systems, synthesizing new samples, and advancing applications in quantum chemistry and materials science. However, extending classical generative approaches to quantum data presents significant challenges, as quantum distributions exhibit superposition, entanglement, and non-locality that classical models struggle to replicate efficiently. Quantum generative models such as quantum generative adversarial networks [24, 42] and quantum variational autoencoders [20, 38] can be used to prepare a fixed single quantum state [21, 28, 37], but are inefficient for generating ensembles of quantum states [3] due to the need for training deep parameterized quantum circuits (PQCs). The quantum denoising diffusion probabilistic model [40] offers a promising approach that employs intermediate training steps to smoothly interpolate between the target distribution and noise, thereby enabling efficient training.


Asymptotic Guarantees for Generative Modeling Based on the Smooth Wasserstein Distance

Neural Information Processing Systems

Minimum distance estimation (MDE) gained recent attention as a formulation of (implicit) generative modeling. It considers minimizing, over model parameters, a statistical distance between the empirical data distribution and the model. This formulation lends itself well to theoretical analysis, but typical results are hindered by the curse of dimensionality. To overcome this and devise a scalable finite-sample statistical MDE theory, we adopt the framework of smooth 1-Wasserstein distance (SWD) $\mathsf{W}_1^{(\sigma)}$. The SWD was recently shown to preserve the metric and topological structure of classic Wasserstein distances, while enjoying dimension-free empirical convergence rates. In this work, we conduct a thorough statistical study of the minimum smooth Wasserstein estimators (MSWEs), first proving the estimator's measurability and asymptotic consistency. We then characterize the limit distribution of the optimal model parameters and their associated minimal SWD. These results imply an $O(n^{-1/2})$ generalization bound for generative modeling based on MSWE, which holds in arbitrary dimension. Our main technical tool is a novel high-dimensional limit distribution result for empirical $\mathsf{W}_1^{(\sigma)}$. The characterization of a nondegenerate limit stands in sharp contrast with the classic empirical 1-Wasserstein distance, for which a similar result is known only in the one-dimensional case. The validity of our theory is supported by empirical results, posing the SWD as a potent tool for learning and inference in high dimensions.




From Kicking to Causality: Simulating Infant Agency Detection with a Robust Intrinsic Reward

Xu, Xia, Triesch, Jochen

arXiv.org Artificial Intelligence

While human infants robustly discover their own causal efficacy, standard reinforcement learning agents remain brittle, as their reliance on correlation-based rewards fails in noisy, ecologically valid scenarios. To address this, we introduce the Causal Action Influence Score (CAIS), a novel intrinsic reward rooted in causal inference. CAIS quantifies an action's influence by measuring the 1-Wasserstein distance between the learned distribution of sensory outcomes conditional on that action, $p(h|a)$, and the baseline outcome distribution, $p(h)$. This divergence provides a robust reward that isolates the agent's causal impact from confounding environmental noise. We test our approach in a simulated infant-mobile environment where correlation-based perceptual rewards fail completely when the mobile is subjected to external forces. In stark contrast, CAIS enables the agent to filter this noise, identify its influence, and learn the correct policy. Furthermore, the high-quality predictive model learned for CAIS allows our agent, when augmented with a surprise signal, to successfully reproduce the "extinction burst" phenomenon. We conclude that explicitly inferring causality is a crucial mechanism for developing a robust sense of agency, offering a psychologically plausible framework for more adaptive autonomous systems.


Non-Reversible Langevin Algorithms for Constrained Sampling

Du, Hengrong, Feng, Qi, Tu, Changwei, Wang, Xiaoyu, Zhu, Lingjiong

arXiv.org Artificial Intelligence

We consider the constrained sampling problem where the goal is to sample from a target distribution on a constrained domain. We propose skew-reflected non-reversible Langevin dynamics (SRNLD), a continuous-time stochastic differential equation with skew-reflected boundary. We obtain non-asymptotic convergence rate of SRNLD to the target distribution in both total variation and 1-Wasserstein distances. By breaking reversibility, we show that the convergence is faster than the special case of the reversible dynamics. Based on the discretization of SRNLD, we propose skew-reflected non-reversible Langevin Monte Carlo (SRNLMC), and obtain non-asymptotic discretization error from SRNLD, and convergence guarantees to the target distribution in 1-Wasserstein distance. We show better performance guarantees than the projected Langevin Monte Carlo in the literature that is based on the reversible dynamics. Numerical experiments are provided for both synthetic and real datasets to show efficiency of the proposed algorithms.


WaKA: Data Attribution using K-Nearest Neighbors and Membership Privacy Principles

Mesana, Patrick, Bénesse, Clément, Lautraite, Hadrien, Caporossi, Gilles, Gambs, Sébastien

arXiv.org Artificial Intelligence

In this paper, we introduce WaKA (Wasserstein K-nearest-neighbors Attribution), a novel attribution method that leverages principles from the LiRA (Likelihood Ratio Attack) framework and k-nearest neighbors classifiers (k-NN). WaKA efficiently measures the contribution of individual data points to the model's loss distribution, analyzing every possible k-NN that can be constructed using the training set, without requiring to sample subsets of the training set. WaKA is versatile and can be used a posteriori as a membership inference attack (MIA) to assess privacy risks or a priori for privacy influence measurement and data valuation. Thus, WaKA can be seen as bridging the gap between data attribution and membership inference attack (MIA) by providing a unified framework to distinguish between a data point's value and its privacy risk. For instance, we have shown that self-attribution values are more strongly correlated with the attack success rate than the contribution of a point to the model generalization. WaKA's different usage were also evaluated across diverse real-world datasets, demonstrating performance very close to LiRA when used as an MIA on k-NN classifiers, but with greater computational efficiency. Additionally, WaKA shows greater robustness than Shapley Values for data minimization tasks (removal or addition) on imbalanced datasets.


Asymptotic Guarantees for Generative Modeling Based on the Smooth Wasserstein Distance

Neural Information Processing Systems

Minimum distance estimation (MDE) gained recent attention as a formulation of (implicit) generative modeling. It considers minimizing, over model parameters, a statistical distance between the empirical data distribution and the model. This formulation lends itself well to theoretical analysis, but typical results are hindered by the curse of dimensionality. To overcome this and devise a scalable finite-sample statistical MDE theory, we adopt the framework of smooth 1-Wasserstein distance (SWD) \mathsf{W}_1 {(\sigma)} . The SWD was recently shown to preserve the metric and topological structure of classic Wasserstein distances, while enjoying dimension-free empirical convergence rates.


Adaptive Learning of the Latent Space of Wasserstein Generative Adversarial Networks

Qiu, Yixuan, Gao, Qingyi, Wang, Xiao

arXiv.org Machine Learning

Generative models based on latent variables, such as generative adversarial networks (GANs) and variational auto-encoders (VAEs), have gained lots of interests due to their impressive performance in many fields. However, many data such as natural images usually do not populate the ambient Euclidean space but instead reside in a lower-dimensional manifold. Thus an inappropriate choice of the latent dimension fails to uncover the structure of the data, possibly resulting in mismatch of latent representations and poor generative qualities. Towards addressing these problems, we propose a novel framework called the latent Wasserstein GAN (LWGAN) that fuses the Wasserstein auto-encoder and the Wasserstein GAN so that the intrinsic dimension of the data manifold can be adaptively learned by a modified informative latent distribution. We prove that there exist an encoder network and a generator network in such a way that the intrinsic dimension of the learned encoding distribution is equal to the dimension of the data manifold. We theoretically establish that our estimated intrinsic dimension is a consistent estimate of the true dimension of the data manifold. Meanwhile, we provide an upper bound on the generalization error of LWGAN, implying that we force the synthetic data distribution to be similar to the real data distribution from a population perspective. Comprehensive empirical experiments verify our framework and show that LWGAN is able to identify the correct intrinsic dimension under several scenarios, and simultaneously generate high-quality synthetic data by sampling from the learned latent distribution.


Injective Sliced-Wasserstein embedding for weighted sets and point clouds

Amir, Tal, Dym, Nadav

arXiv.org Artificial Intelligence

We present the $\textit{Sliced Wasserstein Embedding}$ $\unicode{x2014}$ a novel method to embed multisets and distributions over $\mathbb{R}^d$ into Euclidean space. Our embedding is injective and approximately preserves the Sliced Wasserstein distance. Moreover, when restricted to multisets, it is bi-Lipschitz. We also prove that it is $\textit{impossible}$ to embed distributions over $\mathbb{R}^d$ into a Euclidean space in a bi-Lipschitz manner, even under the assumption that their support is bounded and finite. We demonstrate empirically that our embedding offers practical advantage in learning tasks over existing methods for handling multisets.