barycenter
- North America > Canada > Quebec > Capitale-Nationale Region > Québec (0.04)
- North America > Canada > Quebec > Capitale-Nationale Region > Quebec City (0.04)
- Asia > Middle East > Israel (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- Europe > France (0.04)
- North America > Canada > Quebec > Capitale-Nationale Region > Québec (0.04)
- (2 more...)
. Figure 1 m n 100 1000 10 29 4 s 33 6 s 50 8 1 min 9 1 min 100 15 1 min 24 2 min Table 2: Time to reach relative improvement 10
We thank the reviewers for their comments. We then address reviewer's comments individually (due to space limits please zoom in the tiny figures). For [18] we used Alg. 2 We thank the reviewer for the additional reference, which we will add to the paper. Gradient Descent) applied in parallel to multiple starting points. We thank R2 for the reference "Entropic regularization of continuous optimal transport problems".
Bayesian Multiple Multivariate Density-Density Regression
Nguyen, Khai, Ni, Yang, Mueller, Peter
We propose the first approach for multiple multivariate density-density regression (MDDR), making it possible to consider the regression of a multivariate density-valued response on multiple multivariate density-valued predictors. The core idea is to define a fitted distribution using a sliced Wasserstein barycenter (SWB) of push-forwards of the predictors and to quantify deviations from the observed response using the sliced Wasserstein (SW) distance. Regression functions, which map predictors' supports to the response support, and barycenter weights are inferred within a generalized Bayes framework, enabling principled uncertainty quantification without requiring a fully specified likelihood. The inference process can be seen as an instance of an inverse SWB problem. We establish theoretical guarantees, including the stability of the SWB under perturbations of marginals and barycenter weights, sample complexity of the generalized likelihood, and posterior consistency. For practical inference, we introduce a differentiable approximation of the SWB and a smooth reparameterization to handle the simplex constraint on barycenter weights, allowing efficient gradient-based MCMC sampling. We demonstrate MDDR in an application to inference for population-scale single-cell data. Posterior analysis under the MDDR model in this example includes inference on communication between multiple source/sender cell types and a target/receiver cell type. The proposed approach provides accurate fits, reliable predictions, and interpretable posterior estimates of barycenter weights, which can be used to construct sparse cell-cell communication networks.
- North America > United States > Texas > Travis County > Austin (0.40)
- Asia > Middle East > Israel (0.04)
Continuous Regularized Wasserstein Barycenters
Wasserstein barycenters provide a geometrically meaningful way to aggregate probability distributions, built on the theory of optimal transport. They are difficult to compute in practice, however, leading previous work to restrict their supports to finite sets of points. Leveraging a new dual formulation for the regularized Wasserstein barycenter problem, we introduce a stochastic algorithm that constructs a continuous approximation of the barycenter. We establish strong duality and use the corresponding primal-dual relationship to parametrize the barycenter implicitly using the dual potentials of regularized transport problems. The resulting problem can be solved with stochastic gradient descent, which yields an efficient online algorithm to approximate the barycenter of continuous distributions given sample access. We demonstrate the effectiveness of our approach and compare against previous work on synthetic examples and real-world applications.
Wasserstein Iterative Networks for Barycenter Estimation
Wasserstein barycenters have become popular due to their ability to represent the average of probability measures in a geometrically meaningful way. In this paper, we present an algorithm to approximate the Wasserstein-2 barycenters of continuous measures via a generative model. Previous approaches rely on regularization (entropic/quadratic) which introduces bias or on input convex neural networks which are not expressive enough for large-scale tasks. In contrast, our algorithm does not introduce bias and allows using arbitrary neural networks. In addition, based on the celebrity faces dataset, we construct Ave, celeba!
Convolutional Monge Mapping between EEG Datasets to Support Independent Component Labeling
Meek, Austin, Mendoza-Cardenas, Carlos H., Brockmeier, Austin J.
EEG recordings contain rich information about neural activity but are subject to artifacts, noise, and superficial differences due to sensors, amplifiers, and filtering. Independent component analysis and automatic labeling of independent components (ICs) enable artifact removal in EEG pipelines. Convolutional Monge Mapping Normalization (CMMN) is a recent tool used to achieve spectral conformity of EEG signals, which was shown to improve deep neural network approaches for sleep staging. Here we propose a novel extension of the CMMN method with two alternative approaches to computing the source reference spectrum the target signals are mapped to: (1) channel-averaged and $l_1$-normalized barycenter, and (2) a subject-to-subject mapping that finds the source subject with the closest spectrum to the target subject. Notably, our extension yields space-time separable filters that can be used to map between datasets with different numbers of EEG channels. We apply these filters in an IC classification task, and show significant improvement in recognizing brain versus non-brain ICs. Clinical relevance - EEG recordings are used in the diagnosis and monitoring of multiple neuropathologies, including epilepsy and psychosis. While EEG analysis can benefit from automating artifact removal through independent component analysis and labeling, differences in recording equipment and context (the presence of noise from electrical wiring and other devices) may impact the performance of machine learning models, but these differences can be minimized by appropriate spectral normalization through filtering.
- North America > United States > Delaware > New Castle County > Newark (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe (0.05)
Optimal Transportation and Alignment Between Gaussian Measures
Dandapanthula, Sanjit, Podkopaev, Aleksandr, Kasiviswanathan, Shiva Prasad, Ramdas, Aaditya, Goldfeld, Ziv
Optimal transport (OT) and Gromov-Wasserstein (GW) alignment provide interpretable geometric frameworks for comparing, transforming, and aggregating heterogeneous datasets -- tasks ubiquitous in data science and machine learning. Because these frameworks are computationally expensive, large-scale applications often rely on closed-form solutions for Gaussian distributions under quadratic cost. This work provides a comprehensive treatment of Gaussian, quadratic cost OT and inner product GW (IGW) alignment, closing several gaps in the literature to broaden applicability. First, we treat the open problem of IGW alignment between uncentered Gaussians on separable Hilbert spaces by giving a closed-form expression up to a quadratic optimization over unitary operators, for which we derive tight analytic upper and lower bounds. If at least one Gaussian measure is centered, the solution reduces to a fully closed-form expression, which we further extend to an analytic solution for the IGW barycenter between centered Gaussians. We also present a reduction of Gaussian multimarginal OT with pairwise quadratic costs to a tractable optimization problem and provide an efficient algorithm to solve it using a rank-deficiency constraint. To demonstrate utility, we apply our results to knowledge distillation and heterogeneous clustering on synthetic and real-world datasets.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Michigan (0.04)
- (2 more...)
Parallel Streaming Wasserstein Barycenters
Efficiently aggregating data from different sources is a challenging problem, particularly when samples from each source are distributed differently. These differences can be inherent to the inference task or present for other reasons: sensors in a sensor network may be placed far apart, affecting their individual measurements. Conversely, it is computationally advantageous to split Bayesian inference tasks across subsets of data, but data need not be identically distributed across subsets. One principled way to fuse probability distributions is via the lens of optimal transport: the Wasserstein barycenter is a single distribution that summarizes a collection of input measures while respecting their geometry. However, computing the barycenter scales poorly and requires discretization of all input distributions and the barycenter itself.