
Collaborating Authors

 Hertrich, Johannes


Iterative Importance Fine-tuning of Diffusion Models

arXiv.org Artificial Intelligence

Diffusion models are an important tool for generative modelling, serving as effective priors in applications such as imaging and protein design. A key challenge in applying diffusion models for downstream tasks is efficiently sampling from the resulting posterior distributions, which can be addressed using the h-transform. This work introduces a self-supervised algorithm for fine-tuning diffusion models by estimating the h-transform, enabling amortised conditional sampling. We demonstrate the effectiveness of this framework on class-conditional sampling and reward fine-tuning for text-to-image diffusion models. Diffusion models have emerged as a powerful tool for generative modelling (Ho et al., 2020; Dhariwal & Nichol, 2021). As training these models is expensive and requires large amounts of data, fine-tuning existing models for new tasks is of interest.
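As background on what the h-transform provides (this sketch illustrates its role during sampling, not the paper's self-supervised fine-tuning objective): Doob's h-transform turns an unconditional reverse diffusion into a conditional one by adding the gradient of a log-correction term to the score. The callables `score_uncond` (the pretrained model) and `grad_log_h` (the fine-tuned correction) below are hypothetical placeholders.

```python
import numpy as np

def sample_conditional(score_uncond, grad_log_h, x_T, betas, rng):
    """DDPM-style ancestral sampling where the conditional score is
    approximated as score_uncond(x, t) + grad_log_h(x, t), i.e. the
    pretrained unconditional score plus a learned h-transform correction."""
    x = x_T
    for t in reversed(range(len(betas))):
        beta = betas[t]
        cond_score = score_uncond(x, t) + grad_log_h(x, t)
        x = (x + beta * cond_score) / np.sqrt(1.0 - beta)   # reverse-step mean
        if t > 0:
            x = x + np.sqrt(beta) * rng.standard_normal(x.shape)
    return x
```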


Plug-and-Play Half-Quadratic Splitting for Ptychography

arXiv.org Artificial Intelligence

Ptychography is a coherent diffraction imaging method that uses phase retrieval techniques to reconstruct complex-valued images. It achieves this by sequentially illuminating overlapping regions of a sample with a coherent beam and recording the diffraction pattern. Although this addresses traditional imaging system challenges, it is computationally intensive and highly sensitive to noise, especially with reduced illumination overlap. Data-driven regularisation techniques have been applied in phase retrieval to improve reconstruction quality. In particular, plug-and-play (PnP) offers flexibility by integrating data-driven denoisers as implicit priors. In this work, we propose a half-quadratic splitting framework for using PnP and other data-driven priors for ptychography. We evaluate our method both on natural images and real test objects to validate its effectiveness for ptychographic image reconstruction.
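For readers unfamiliar with half-quadratic splitting (HQS) in the plug-and-play setting, the generic loop alternates a data-fidelity update with a denoising step. The sketch below is a minimal version with a linear forward operator and a quadratic data term (the ptychographic data term is a nonlinear phase-retrieval fidelity, not reproduced here); `denoiser` is a placeholder for any pretrained denoiser acting as the implicit prior.

```python
import numpy as np

def pnp_hqs(A, y, denoiser, mu=1.0, n_iter=50):
    """Generic plug-and-play HQS for min_x ||A x - y||^2 + prior(x), where the
    prior is only accessed through a denoiser (implicit regulariser)."""
    x = A.T @ y                          # crude initialisation
    z = x.copy()
    AtA, Aty = A.T @ A, A.T @ y
    I = np.eye(A.shape[1])
    for _ in range(n_iter):
        # x-step: argmin_x ||A x - y||^2 + (mu/2) ||x - z||^2  (linear system)
        x = np.linalg.solve(2 * AtA + mu * I, 2 * Aty + mu * z)
        # z-step: the proximal map of the prior is replaced by the denoiser
        z = denoiser(x)
    return z
```

In practice, the coupling parameter mu is typically increased over the iterations so that x and z are gradually forced to agree.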


Generative Feature Training of Thin 2-Layer Networks

arXiv.org Machine Learning

We consider the approximation of functions by 2-layer neural networks with a small number of hidden weights based on the squared loss and small datasets. Due to the highly non-convex energy landscape, gradient-based training often suffers from local minima. As a remedy, we initialize the hidden weights with samples from a learned proposal distribution, which we parameterize as a deep generative model. To train this model, we exploit the fact that with fixed hidden weights, the optimal output weights solve a linear equation. After learning the generative model, we refine the sampled weights with a gradient-based post-processing in the latent space. Here, we also include a regularization scheme to counteract potential noise. Finally, we demonstrate the effectiveness of our approach by numerical examples.
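The fact that, for fixed hidden weights, the optimal output weights of a 2-layer network under the squared loss solve a linear least-squares problem is easy to make concrete. A minimal sketch, with the learned proposal distribution of the paper replaced by plain Gaussian sampling of the hidden weights:

```python
import numpy as np

def fit_output_weights(X, y, W, b):
    """Given fixed hidden weights W (m x d) and biases b (m,), the optimal
    output weights of f(x) = a^T relu(W x + b) for the squared loss solve a
    linear least-squares problem."""
    features = np.maximum(X @ W.T + b, 0.0)            # hidden activations, (N, m)
    a, *_ = np.linalg.lstsq(features, y, rcond=None)   # optimal output weights
    return a, features @ a                             # weights and fitted values

# toy usage: hidden weights sampled from a (here: Gaussian) proposal distribution
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)
W, b = rng.standard_normal((16, 3)), rng.standard_normal(16)
a, y_hat = fit_output_weights(X, y, W, b)
```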


Fast Summation of Radial Kernels via QMC Slicing

arXiv.org Machine Learning

The fast computation of large kernel sums is a challenging task, which arises as a subproblem in any kernel method. We approach the problem by slicing, which relies on random projections to one-dimensional subspaces and fast Fourier summation. We prove bounds for the slicing error and propose a quasi-Monte Carlo (QMC) approach for selecting the projections based on spherical quadrature rules. Numerical examples demonstrate that our QMC-slicing approach significantly outperforms existing methods like (QMC-)random Fourier features, orthogonal Fourier features or non-QMC slicing on standard test datasets.
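The slicing estimator itself fits in a few lines: project both point sets onto directions on the sphere, evaluate one-dimensional kernel sums, and average. In the rough sketch below, i.i.d. uniform directions stand in for the spherical quadrature (QMC) points of the paper, the inner 1D sum is evaluated naively instead of with a fast solver, and the negative distance kernel is used because its one-dimensional counterpart is again a negative distance (up to a dimensional constant, omitted here).

```python
import numpy as np

def sliced_kernel_sum(x, y, w, k1d, directions):
    """Approximate s_i = sum_j w_j K(x_i, y_j) for a radial kernel K by
    averaging one-dimensional kernel sums over projection directions."""
    s = np.zeros(len(x))
    for xi in directions:                 # xi: unit vector in R^d
        px, py = x @ xi, y @ xi           # project both point sets to 1D
        s += (k1d(px[:, None] - py[None, :]) * w).sum(axis=1)  # naive O(N*M)
    return s / len(directions)

# toy usage with random directions standing in for a spherical quadrature rule
rng = np.random.default_rng(1)
d, P = 5, 64
x, y = rng.standard_normal((100, d)), rng.standard_normal((80, d))
w = np.full(80, 1.0 / 80)
directions = rng.standard_normal((P, d))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)
s = sliced_kernel_sum(x, y, w, k1d=lambda t: -np.abs(t), directions=directions)
```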


Mixed Noise and Posterior Estimation with Conditional DeepGEM

arXiv.org Artificial Intelligence

In numerous healthcare and other contemporary applications, the variables of primary interest are obtained through indirect measurements, such as in the case of Magnetic Resonance Imaging (MRI) and Computed Tomography (CT). For some of these applications, the reliability of the results is of particular importance. The accuracy and trustworthiness of the outcomes obtained through indirect measurements are significantly influenced by two critical factors: the degree of uncertainty associated with the measuring instrument and the appropriateness of the (forward) model used for the reconstruction of the parameters of interest (measurand). In this paper, we consider Bayesian inversion to obtain the measurand from signals measured by the instrument and a noise model that mimics both the instrument noise and the error of the forward model.


Fast Kernel Summation in High Dimensions via Slicing and Fourier Transforms

arXiv.org Artificial Intelligence

Kernel-based methods are heavily used in machine learning. However, they suffer from $O(N^2)$ complexity in the number $N$ of considered data points. In this paper, we propose an approximation procedure, which reduces this complexity to $O(N)$. Our approach is based on two ideas. First, we prove that any radial kernel with an analytic basis function can be represented as a sliced version of some one-dimensional kernel, and we derive an analytic formula for the one-dimensional counterpart. It turns out that the relation between one- and $d$-dimensional kernels is given by a generalized Riemann-Liouville fractional integral. Hence, we can reduce the $d$-dimensional kernel summation to a one-dimensional setting. Second, for solving these one-dimensional problems efficiently, we apply fast Fourier summations on non-equispaced data, a sorting algorithm, or a combination of both. Due to its practical importance, we pay special attention to the Gaussian kernel, where we show a dimension-independent error bound and represent its one-dimensional counterpart via a closed-form Fourier transform. We provide a run time comparison and error estimate of our fast kernel summations.
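To illustrate the fast one-dimensional summation via Fourier methods: once the 1D kernel is approximated by a truncated Fourier series on a period containing all pairwise differences, the double sum factorises into two non-equispaced Fourier sums, which an NFFT can evaluate fast. The sketch below evaluates these sums naively, assumes the data have been rescaled into [-1/4, 1/4], and glosses over the periodisation details and the derivation of the 1D kernel itself.

```python
import numpy as np

def fourier_kernel_sum_1d(x, y, w, k, n_modes=64):
    """Approximate s_i = sum_j w_j k(x_i - y_j) for 1D points in [-1/4, 1/4]:
    replace k by its truncated Fourier series on [-1/2, 1/2), so that the
    double sum factorises into two non-equispaced Fourier sums."""
    m = np.arange(-n_modes // 2, n_modes // 2)
    grid = -0.5 + np.arange(n_modes) / n_modes
    # Fourier coefficients of (the periodisation of) k from equispaced samples
    coeff = np.fft.fftshift(np.fft.fft(np.fft.ifftshift(k(grid)))) / n_modes
    # adjoint step: d_m = sum_j w_j exp(-2*pi*i*m*y_j)
    d = (w[None, :] * np.exp(-2j * np.pi * m[:, None] * y[None, :])).sum(axis=1)
    # evaluation step: s_i = sum_m coeff_m d_m exp(2*pi*i*m*x_i)
    return ((coeff * d)[None, :] * np.exp(2j * np.pi * x[:, None] * m[None, :])).sum(axis=1).real
```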


Learning from small data sets: Patch-based regularizers in inverse problems for image reconstruction

arXiv.org Artificial Intelligence

The solution of inverse problems is of fundamental interest in medical and astronomical imaging, geophysics as well as engineering and life sciences. Recent advances were made by using methods from machine learning, in particular deep neural networks. Most of these methods require a huge amount of (paired) data and computational capacity to train the networks, which are often not available. Our paper addresses the issue of learning from small data sets by taking patches of very few images into account. We focus on the combination of model-based and data-driven methods by approximating just the image prior, also known as the regularizer in the variational model. We review two methodically different approaches, namely optimizing the maximum log-likelihood of the patch distribution, and penalizing Wasserstein-like discrepancies of whole empirical patch distributions. From the point of view of Bayesian inverse problems, we show how we can achieve uncertainty quantification by approximating the posterior using Langevin Monte Carlo methods. We demonstrate the power of the methods in computed tomography, image super-resolution, and inpainting. Indeed, the approach also provides high-quality results in zero-shot super-resolution, where only a low-resolution image is available. The paper is accompanied by a GitHub repository containing implementations of all methods as well as data examples so that the reader can get their own insight into the performance.
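The uncertainty-quantification part mentioned above amounts to running a Langevin scheme whose drift combines the likelihood score with the score of the learned patch prior. A minimal sketch with placeholder score callables (the patch regularisers themselves are not reproduced here):

```python
import numpy as np

def langevin_posterior_sampler(x0, grad_log_likelihood, grad_log_prior,
                               step=1e-4, n_steps=1000, rng=None):
    """Unadjusted Langevin algorithm targeting the posterior: the drift is the
    sum of the likelihood score and the (patch-based) prior score."""
    rng = np.random.default_rng() if rng is None else rng
    x = x0.copy()
    samples = []
    for _ in range(n_steps):
        drift = grad_log_likelihood(x) + grad_log_prior(x)
        x = x + step * drift + np.sqrt(2 * step) * rng.standard_normal(x.shape)
        samples.append(x.copy())
    return samples
```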


Generative Sliced MMD Flows with Riesz Kernels

arXiv.org Machine Learning

Maximum mean discrepancy (MMD) flows suffer from high computational costs in large-scale computations. In this paper, we show that MMD flows with Riesz kernels $K(x,y) = - \Vert x-y\Vert^r$, $r \in (0,2)$ have exceptional properties which allow their efficient computation. We prove that the MMD of Riesz kernels, which is also known as the energy distance, coincides with the MMD of their sliced version. As a consequence, the computation of gradients of MMDs can be performed in the one-dimensional setting. Here, for $r=1$, a simple sorting algorithm can be applied to reduce the complexity from $O(MN+N^2)$ to $O((M+N)\log(M+N))$ for two measures with $M$ and $N$ support points. As another interesting follow-up result, the MMD of compactly supported measures can be estimated from above and below by the Wasserstein-1 distance. For the implementations, we approximate the gradient of the sliced MMD by using only a finite number $P$ of slices. We show that the resulting error is of order $O(\sqrt{d/P})$, where $d$ is the data dimension. These results enable us to train generative models by approximating MMD gradient flows with neural networks, even for image applications. We demonstrate the efficiency of our model by image generation on MNIST, FashionMNIST and CIFAR10.
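The sorting trick behind the reduced complexity is easiest to see on one-dimensional sums of absolute differences; the MMD gradient for two point sets is assembled from the same prefix-sum idea. A small self-contained sketch:

```python
import numpy as np

def abs_distance_sums_1d(x):
    """s_i = sum_j |x_i - x_j| for all i in O(N log N) via sorting and prefix sums."""
    order = np.argsort(x)
    xs = x[order]
    n = len(x)
    csum = np.cumsum(xs)                     # P_i = xs_1 + ... + xs_i
    i = np.arange(1, n + 1)
    s_sorted = (2 * i - n) * xs + csum[-1] - 2 * csum
    s = np.empty(n)
    s[order] = s_sorted                      # undo the sorting permutation
    return s

# sanity check against the naive O(N^2) computation
x = np.random.default_rng(2).standard_normal(1000)
assert np.allclose(abs_distance_sums_1d(x), np.abs(x[:, None] - x[None, :]).sum(axis=1))
```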


Posterior Sampling Based on Gradient Flows of the MMD with Negative Distance Kernel

arXiv.org Machine Learning

We propose conditional flows of the maximum mean discrepancy (MMD) with the negative distance kernel for posterior sampling and conditional generative modeling. This MMD, which is also known as energy distance, has several advantageous properties like efficient computation via slicing and sorting. We approximate the joint distribution of the ground truth and the observations using discrete Wasserstein gradient flows and establish an error bound for the posterior distributions. Further, we prove that our particle flow is indeed a Wasserstein gradient flow of an appropriate functional. The power of our method is demonstrated by numerical examples including conditional image generation and inverse problems like superresolution, inpainting and computed tomography in low-dose and limited-angle settings.
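One crude way to picture the conditional particle flow (a strongly simplified stand-in for the method, using a naive O(N^2) MMD gradient rather than slicing and sorting): joint particles consisting of a clamped observation and a moving latent part follow the negative MMD gradient towards paired samples from the joint distribution. Function and parameter names below are placeholders.

```python
import numpy as np

def mmd_grad(q, p):
    """Gradient of MMD^2 between the empirical measures of q and p with respect
    to the particle positions q, for the negative distance kernel K(a,b) = -||a-b||."""
    def unit_dirs(a, b):
        diff = a[:, None, :] - b[None, :, :]
        dist = np.linalg.norm(diff, axis=2, keepdims=True)
        return np.divide(diff, dist, out=np.zeros_like(diff), where=dist > 0)
    n, m = len(q), len(p)
    return (-2 / n**2) * unit_dirs(q, q).sum(axis=1) + (2 / (n * m)) * unit_dirs(q, p).sum(axis=1)

def conditional_particle_flow(y_obs, joint_samples, d_y, n_particles=200,
                              step=1.0, n_steps=500, rng=None):
    """Joint particles (y_obs, z) follow the negative MMD gradient towards the
    empirical joint distribution; only the z-part moves, the observation part
    stays clamped to y_obs."""
    rng = np.random.default_rng() if rng is None else rng
    z = rng.standard_normal((n_particles, joint_samples.shape[1] - d_y))
    for _ in range(n_steps):
        q = np.hstack([np.tile(y_obs, (n_particles, 1)), z])
        z = z - step * mmd_grad(q, joint_samples)[:, d_y:]
    return z
```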


Neural Wasserstein Gradient Flows for Maximum Mean Discrepancies with Riesz Kernels

arXiv.org Artificial Intelligence

In this paper we contribute to the understanding of such flows. We propose to approximate the backward scheme of Jordan, Kinderlehrer and Otto for computing such Wasserstein gradient flows, as well as a forward scheme for so-called Wasserstein steepest descent flows, by neural networks (NNs). Since we cannot restrict ourselves to absolutely continuous measures, we have to deal with transport plans and velocity plans instead of usual transport maps and velocity fields. Indeed, we approximate the disintegration of both plans by generative NNs which are learned with respect to appropriate loss functions.

For approximating Wasserstein gradient flows for more general functionals, a backward discretization scheme in time, known as the Jordan-Kinderlehrer-Otto (JKO) scheme (Giorgi, 1993; Jordan et al., 1998), can be used. Its basic idea is to discretize the whole flow in time by applying iteratively the Wasserstein proximal operator with respect to F. In the case of absolutely continuous measures, Brenier's theorem (Brenier, 1987) can be applied to rewrite this operator via transport maps having convex potentials and to learn these transport maps (Fan et al., 2022) or their potentials (Alvarez-Melis et al., 2022; Bunne et al., 2022; Mokrov et al., 2021) by neural networks (NNs).
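As a pointer to what one JKO step computes: it is a proximal step in Wasserstein space, minimising F plus a scaled squared Wasserstein distance to the previous measure. The particle-level sketch below is heavily simplified: equally weighted particles, a simple potential energy standing in for F (the MMD functionals of the paper couple the particles), and the optimal coupling replaced by the identity pairing of particles with their previous positions.

```python
import numpy as np

def jko_step(particles, grad_V, tau=0.5, inner_steps=200, lr=0.1):
    """One simplified JKO step for a potential energy F(mu) = int V dmu on
    equally weighted particles: each particle approximately solves
        min_x  V(x) + ||x - x_prev||^2 / (2 * tau)
    by plain gradient descent, i.e. an (approximate) proximal step."""
    prev = particles.copy()
    x = particles.copy()
    for _ in range(inner_steps):
        x = x - lr * (grad_V(x) + (x - prev) / tau)
    return x

# toy flow of 100 particles under V(x) = ||x||^2 / 2 (particles contract towards 0)
rng = np.random.default_rng(3)
pts = rng.standard_normal((100, 2)) + 3.0
for _ in range(10):
    pts = jko_step(pts, grad_V=lambda x: x, tau=0.5)
```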