
 Steidl, Gabriele


Transfer Operators from Batches of Unpaired Points via Entropic Transport Kernels

arXiv.org Machine Learning

In this paper, we are concerned with estimating the joint probability of random variables $X$ and $Y$, given $N$ independent observation blocks $(\boldsymbol{x}^i,\boldsymbol{y}^i)$, $i=1,\ldots,N$, each of $M$ samples $(\boldsymbol{x}^i,\boldsymbol{y}^i) = \bigl((x^i_j, y^i_{\sigma^i(j)}) \bigr)_{j=1}^M$, where $\sigma^i$ denotes an unknown permutation of i.i.d. sampled pairs $(x^i_j,y_j^i)$, $j=1,\ldots,M$. This means that the internal ordering of the $M$ samples within an observation block is not known. We derive a maximum-likelihood inference functional, propose a computationally tractable approximation and analyze their properties. In particular, we prove a $\Gamma$-convergence result showing that we can recover the true density from empirical approximations as the number $N$ of blocks goes to infinity. Using entropic optimal transport kernels, we model a class of hypothesis spaces of density functions over which the inference functional can be minimized. This hypothesis class is particularly suited for approximate inference of transfer operators from data. We solve the resulting discrete minimization problem by a modification of the EMML algorithm to take additional transition probability constraints into account and prove the convergence of this algorithm. Proof-of-concept examples demonstrate the potential of our method.
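
As an illustration of the entropic optimal transport kernels underlying the hypothesis class, the following minimal sketch computes an entropic transport plan between two discrete marginals via Sinkhorn iterations and row-normalizes it into a transition kernel. The grids, the cost, and the regularization parameter are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def entropic_ot_kernel(mu, nu, cost, eps=0.1, n_iter=500):
    """Sinkhorn iterations for the entropic OT plan between discrete
    marginals mu and nu with cost matrix `cost` (illustrative sketch)."""
    K = np.exp(-cost / eps)                 # Gibbs kernel
    u = np.ones_like(mu)
    for _ in range(n_iter):
        v = nu / (K.T @ u)                  # enforce second marginal
        u = mu / (K @ v)                    # enforce first marginal
    plan = u[:, None] * K * v[None, :]      # entropic transport plan
    # Row-normalizing the plan yields a transition kernel from the support
    # of mu to the support of nu, i.e. a discrete transfer-operator-like object.
    kernel = plan / plan.sum(axis=1, keepdims=True)
    return plan, kernel

# Toy usage on two uniform 1D grids (hypothetical data)
x = np.linspace(0.0, 1.0, 50)
y = np.linspace(0.0, 1.0, 60)
mu, nu = np.full(50, 1 / 50), np.full(60, 1 / 60)
plan, kernel = entropic_ot_kernel(mu, nu, (x[:, None] - y[None, :]) ** 2)
```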


Wasserstein Gradient Flows for Moreau Envelopes of f-Divergences in Reproducing Kernel Hilbert Spaces

arXiv.org Artificial Intelligence

Most commonly used $f$-divergences of measures, e.g., the Kullback-Leibler divergence, are subject to limitations regarding the support of the involved measures. A remedy consists of regularizing the $f$-divergence by a squared maximum mean discrepancy (MMD) associated with a characteristic kernel $K$. In this paper, we use the so-called kernel mean embedding to show that the corresponding regularization can be rewritten as the Moreau envelope of some function in the reproducing kernel Hilbert space associated with $K$. Then, we exploit well-known results on Moreau envelopes in Hilbert spaces to prove properties of the MMD-regularized $f$-divergences and, in particular, their gradients. Subsequently, we use our findings to analyze Wasserstein gradient flows of MMD-regularized $f$-divergences. Finally, we consider Wasserstein gradient flows starting from empirical measures and provide proof-of-concept numerical examples with Tsallis-$\alpha$ divergences.
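
Schematically, and as a hedged paraphrase rather than the paper's exact definition, write $m_\mu \in \mathcal{H}_K$ for the kernel mean embedding of a measure $\mu$. The MMD-regularized $f$-divergence described above is then an infimal convolution,

\[
D_{f,\lambda}(\mu \mid \nu) \;=\; \inf_{\sigma} \Bigl\{ D_f(\sigma \mid \nu) + \tfrac{1}{2\lambda}\, \mathrm{MMD}_K^2(\sigma,\mu) \Bigr\},
\qquad
\mathrm{MMD}_K(\sigma,\mu) = \lVert m_\sigma - m_\mu \rVert_{\mathcal{H}_K},
\]

which, after the kernel mean embedding, becomes the Moreau envelope of a suitable function on $\mathcal{H}_K$ evaluated at $m_\mu$.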


Mixed Noise and Posterior Estimation with Conditional DeepGEM

arXiv.org Artificial Intelligence

In numerous healthcare and other contemporary applications, the variables of primary interest are obtained through indirect measurements, such as in the case of Magnetic Resonance Imaging (MRI) and Computed Tomography (CT). For some of these applications, the reliability of the results is of particular importance. The accuracy and trustworthiness of the outcomes obtained through indirect measurements are significantly influenced by two critical factors: the degree of uncertainty associated with the measuring instrument and the appropriateness of the (forward) model used for the reconstruction of the parameters of interest (measurand). In this paper, we consider Bayesian inversion to obtain the measurand from signals measured by the instrument and a noise model that mimics both the instrument noise and the error of the forward model.
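
One common way to make this concrete, which may differ from the exact parametrization used in the paper, is to model the data as a forward map corrupted by both additive instrument noise and a signal-dependent model-error term, and to infer the measurand from the Bayesian posterior:

\[
y = f(x) + \eta, \qquad \eta \mid x \sim \mathcal{N}\!\bigl(0,\; a^2 I + b^2 \operatorname{diag}\!\bigl(f(x)\bigr)^2\bigr), \qquad p(x \mid y) \propto p(y \mid x)\, p(x),
\]

where $a$ and $b$ control the absolute and relative noise levels and could themselves be treated as unknowns to be inferred alongside $x$.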


Manifold GCN: Diffusion-based Convolutional Neural Network for Manifold-valued Graphs

arXiv.org Artificial Intelligence

We propose two graph neural network layers for graphs with features in a Riemannian manifold. First, based on a manifold-valued graph diffusion equation, we construct a diffusion layer that can be applied to an arbitrary number of nodes and graph connectivity patterns. Second, we model a tangent multilayer perceptron by transferring ideas from the vector neuron framework to our general setting. Both layers are equivariant with respect to node permutations and isometries of the feature manifold. These properties have been shown to lead to a beneficial inductive bias in many deep learning tasks. Numerical examples on synthetic data as well as on triangle meshes of the right hippocampus for the classification of Alzheimer's disease demonstrate the very good performance of our layers.
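
The following is a minimal sketch of one explicit graph-diffusion step for manifold-valued node features, here on the unit sphere with hand-written log/exp maps and a weighted adjacency matrix; it illustrates the general idea of such a diffusion layer, not the authors' exact construction.

```python
import numpy as np

def sphere_log(p, q):
    """Log map on the unit sphere: tangent vector at p pointing towards q."""
    v = q - np.dot(p, q) * p
    nv = np.linalg.norm(v)
    if nv < 1e-12:
        return np.zeros_like(p)
    return np.arccos(np.clip(np.dot(p, q), -1.0, 1.0)) * v / nv

def sphere_exp(p, v):
    """Exp map on the unit sphere at p applied to tangent vector v."""
    nv = np.linalg.norm(v)
    return p if nv < 1e-12 else np.cos(nv) * p + np.sin(nv) * v / nv

def diffusion_step(features, adjacency, tau=0.1):
    """One explicit diffusion step: each node moves along the weighted mean of
    the log-mapped features of its neighbors.
    features: (n_nodes, 3) unit vectors, adjacency: (n_nodes, n_nodes) weights >= 0."""
    out = np.empty_like(features)
    for i, p in enumerate(features):
        w = adjacency[i]
        deg = w.sum()
        if deg == 0:
            out[i] = p
            continue
        tangent = sum(wij * sphere_log(p, q) for wij, q in zip(w, features) if wij > 0)
        out[i] = sphere_exp(p, tau * tangent / deg)
    return out
```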


Learning from small data sets: Patch-based regularizers in inverse problems for image reconstruction

arXiv.org Artificial Intelligence

The solution of inverse problems is of fundamental interest in medical and astronomical imaging, geophysics as well as engineering and life sciences. Recent advances were made by using methods from machine learning, in particular deep neural networks. Most of these methods require a huge amount of (paired) data and computer capacity to train the networks, which often may not be available. Our paper addresses the issue of learning from small data sets by taking patches of very few images into account. We focus on the combination of model-based and data-driven methods by approximating just the image prior, also known as regularizer in the variational model. We review two methodically different approaches, namely optimizing the maximum log-likelihood of the patch distribution, and penalizing Wasserstein-like discrepancies of whole empirical patch distributions. From the point of view of Bayesian inverse problems, we show how we can achieve uncertainty quantification by approximating the posterior using Langevin Monte Carlo methods. We demonstrate the power of the methods in computed tomography, image super-resolution, and inpainting. Indeed, the approach also provides high-quality results in zero-shot super-resolution, where only a low-resolution image is available. The paper is accompanied by a GitHub repository containing implementations of all methods as well as data examples so that the reader can get their own insight into the performance.
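
A hedged sketch of the common structure behind both reviewed approaches: a data-fidelity term plus a patch-based regularizer, together with an unadjusted Langevin step for posterior sampling. The placeholder `patch_neg_log_prior`, the forward operator, and all step sizes are illustrative assumptions, not the repository's API.

```python
import torch

def extract_patches(img, size=6, stride=3):
    """Flattened overlapping patches of an image tensor of shape (1, 1, H, W)."""
    return torch.nn.functional.unfold(img, kernel_size=size, stride=stride).transpose(1, 2)

def objective(x, y, forward_op, patch_neg_log_prior, lam=1.0):
    """Variational objective: squared data fidelity plus patch-based regularizer."""
    fidelity = 0.5 * ((forward_op(x) - y) ** 2).sum()
    reg = patch_neg_log_prior(extract_patches(x)).mean()
    return fidelity + lam * reg

def langevin_step(x, y, forward_op, patch_neg_log_prior, step=1e-4, lam=1.0):
    """One unadjusted Langevin step targeting the (approximate) posterior,
    which is how uncertainty quantification can be obtained from the model."""
    x = x.detach().requires_grad_(True)
    loss = objective(x, y, forward_op, patch_neg_log_prior, lam)
    grad, = torch.autograd.grad(loss, x)
    return x.detach() - step * grad + (2 * step) ** 0.5 * torch.randn_like(x)
```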


Multilevel Diffusion: Infinite Dimensional Score-Based Diffusion Models for Image Generation

arXiv.org Machine Learning

Score-based diffusion models (SBDM) have recently emerged as state-of-the-art approaches for image generation. Existing SBDMs are typically formulated in a finite-dimensional setting, where images are considered as tensors of finite size. This paper develops SBDMs in the infinite-dimensional setting, that is, we model the training data as functions supported on a rectangular domain. Besides the quest for generating images at ever higher resolution, our primary motivation is to create a well-posed infinite-dimensional learning problem so that we can discretize it consistently on multiple resolution levels. We thereby intend to obtain diffusion models that generalize across different resolution levels and improve the efficiency of the training process. We demonstrate how to overcome two shortcomings of current SBDM approaches in the infinite-dimensional setting. First, we modify the forward process to ensure that the latent distribution is well-defined in the infinite-dimensional setting using the notion of trace class operators. We derive the reverse processes for finite approximations. Second, we illustrate that approximating the score function with an operator network is beneficial for multilevel training. After deriving the convergence of the discretization and the approximation of multilevel training, we implement an infinite-dimensional SBDM approach and show the first promising results on MNIST and Fashion-MNIST, underlining our developed theory.
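
Schematically, one can think of the modified forward process as an Ornstein-Uhlenbeck-type equation on a function space driven by a $Q$-Wiener process whose covariance operator $Q$ is trace class, so that the latent distribution is a well-defined Gaussian measure; the exact drift and scaling used in the paper may differ:

\[
\mathrm{d}X_t = -\tfrac{1}{2} X_t \,\mathrm{d}t + \mathrm{d}W^Q_t,
\qquad \operatorname{tr}(Q) < \infty,
\qquad X_t \;\longrightarrow\; \mathcal{N}(0, Q) \quad \text{as } t \to \infty .
\]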


Conditional Generative Models are Provably Robust: Pointwise Guarantees for Bayesian Inverse Problems

arXiv.org Artificial Intelligence

Conditional generative models have become a very powerful tool for sampling from posteriors of Bayesian inverse problems. It is well-known in classical Bayesian literature that posterior measures are quite robust with respect to perturbations of both the prior measure and the negative log-likelihood, which includes perturbations of the observations. However, to the best of our knowledge, the robustness of conditional generative models with respect to perturbations of the observations has not been investigated yet. In this paper, we prove for the first time that appropriately learned conditional generative models provide robust results for single observations.


Posterior Sampling Based on Gradient Flows of the MMD with Negative Distance Kernel

arXiv.org Machine Learning

We propose conditional flows of the maximum mean discrepancy (MMD) with the negative distance kernel for posterior sampling and conditional generative modeling. This MMD, which is also known as energy distance, has several advantageous properties like efficient computation via slicing and sorting. We approximate the joint distribution of the ground truth and the observations using discrete Wasserstein gradient flows and establish an error bound for the posterior distributions. Further, we prove that our particle flow is indeed a Wasserstein gradient flow of an appropriate functional. The power of our method is demonstrated by numerical examples including conditional image generation and inverse problems like superresolution, inpainting and computed tomography in low-dose and limited-angle settings.
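
The sorting trick behind the efficient computation can be sketched as follows: for one-dimensional samples, the squared MMD with negative distance kernel (the energy distance) reduces to sorted sums of pairwise distances, and higher-dimensional samples are handled by averaging over random projections (slicing). Function names and the projection count are illustrative.

```python
import numpy as np

def pairwise_abs_sum(z):
    """Sum of |z_i - z_j| over all ordered pairs, in O(n log n) via sorting."""
    z = np.sort(z)
    n = len(z)
    coeff = 2.0 * np.arange(n) - (n - 1)   # signed multiplicity of each sorted entry
    return 2.0 * np.dot(coeff, z)          # factor 2 counts both orderings (i, j) and (j, i)

def energy_mmd2_1d(x, y):
    """Squared MMD with kernel k(s, t) = -|s - t| between 1D samples x and y."""
    n, m = len(x), len(y)
    sxx, syy = pairwise_abs_sum(x), pairwise_abs_sum(y)
    cross = 0.5 * (pairwise_abs_sum(np.concatenate([x, y])) - sxx - syy)
    return 2.0 * cross / (n * m) - sxx / n**2 - syy / m**2

def sliced_energy_mmd2(x, y, n_proj=100, seed=0):
    """Average the 1D value over random directions for samples of shape (n, d)."""
    rng = np.random.default_rng(seed)
    dirs = rng.normal(size=(n_proj, x.shape[1]))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    return float(np.mean([energy_mmd2_1d(x @ v, y @ v) for v in dirs]))
```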


Neural Wasserstein Gradient Flows for Maximum Mean Discrepancies with Riesz Kernels

arXiv.org Artificial Intelligence

In this paper we contribute to the understanding of Wasserstein gradient flows of maximum mean discrepancy (MMD) functionals with Riesz kernels. We propose to approximate the backward scheme of Jordan, Kinderlehrer and Otto for computing such Wasserstein gradient flows, as well as a forward scheme for so-called Wasserstein steepest descent flows, by neural networks (NNs). Since we cannot restrict ourselves to absolutely continuous measures, we have to deal with transport plans and velocity plans instead of usual transport maps and velocity fields. Indeed, we approximate the disintegration of both plans by generative NNs which are learned with respect to appropriate loss functions.
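
For context, the backward (Jordan-Kinderlehrer-Otto) scheme mentioned above discretizes the gradient flow of a functional $\mathcal{F}$ in time through iterated Wasserstein proximal steps; in standard notation, with step size $\tau > 0$,

\[
\mu_{k+1} \in \operatorname*{arg\,min}_{\mu \in \mathcal{P}_2(\mathbb{R}^d)} \; \frac{1}{2\tau}\, W_2^2(\mu, \mu_k) + \mathcal{F}(\mu).
\]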


PatchNR: Learning from Very Few Images by Patch Normalizing Flow Regularization

arXiv.org Artificial Intelligence

Learning neural networks from only a small amount of available data is an important ongoing research topic with tremendous potential for applications. In this paper, we introduce a powerful regularizer for the variational modeling of inverse problems in imaging. Our regularizer, called patch normalizing flow regularizer (patchNR), involves a normalizing flow learned on small patches of very few images. In particular, the training is independent of the considered inverse problem such that the same regularizer can be applied for different forward operators acting on the same class of images. By investigating the distribution of patches versus those of the whole image class, we prove that our model is indeed a MAP approach. Numerical examples for low-dose and limited-angle computed tomography (CT) as well as superresolution of material images demonstrate that our method provides very high-quality results. The training set consists of just six images for CT and one image for superresolution. Finally, we combine our patchNR with ideas from internal learning for performing superresolution of natural images directly from the low-resolution observation without knowledge of any high-resolution image.
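
A hedged sketch of how such a flow-based patch regularizer typically enters the variational problem: the negative log-likelihood of randomly sampled patches under the learned normalizing flow is added to the data-fidelity term. The interface of `flow`, the patch size, and the weights are illustrative assumptions, not the actual patchNR implementation.

```python
import torch

def patchnr_regularizer(x, flow, n_patches=1000, size=6):
    """Monte Carlo estimate of the negative log-likelihood of random patches of
    an image x of shape (1, 1, H, W) under a trained normalizing flow.
    `flow(p)` is assumed to return (latent code z, log|det Jacobian|)."""
    patches = torch.nn.functional.unfold(x, kernel_size=size).transpose(1, 2)[0]
    idx = torch.randint(patches.shape[0], (n_patches,))
    z, logdet = flow(patches[idx])
    neg_log_lik = 0.5 * (z ** 2).sum(dim=1) - logdet   # standard normal latent
    return neg_log_lik.mean()

def reconstruction_loss(x, y, forward_op, flow, lam=0.1):
    """MAP-style objective: squared data fidelity plus the patch regularizer."""
    return 0.5 * ((forward_op(x) - y) ** 2).sum() + lam * patchnr_regularizer(x, flow)
```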