AITopics

2302.02766

Country:

North America > United States (1.00)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial Intelligence May-26-2023

Error Bounds for Flow Matching Methods

Benton, Joe, Deligiannidis, George, Doucet, Arnaud

Score-based generative models are a popular class of generative modelling techniques relying on stochastic differential equations (SDE). From their inception, it was realized that it was also possible to perform generation using ordinary differential equations (ODE) rather than SDE. This led to the introduction of the probability flow ODE approach and denoising diffusion implicit models. Flow matching methods have recently further extended these ODE-based approaches and approximate a flow between two arbitrary probability distributions. Previous work derived bounds on the approximation error of diffusion models under the stochastic sampling regime, given assumptions on the $L^2$ loss. We present error bounds for the flow matching procedure using fully deterministic sampling, assuming an $L^2$ bound on the approximation error and a certain regularity condition on the data distributions.

artificial intelligence, international conference, machine learning, (14 more...)

2305.16860

Country: Europe > United Kingdom (0.29)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial Intelligence May-11-2023

From Denoising Diffusions to Denoising Markov Models

Benton, Joe, Shi, Yuyang, De Bortoli, Valentin, Deligiannidis, George, Doucet, Arnaud

Denoising diffusions are state-of-the-art generative models exhibiting remarkable empirical performance. They work by diffusing the data distribution into a Gaussian distribution and then learning to reverse this noising process to obtain synthetic datapoints. The denoising diffusion relies on approximations of the logarithmic derivatives of the noised data densities using score matching. Such models can also be used to perform approximate posterior simulation when one can only sample from the prior and likelihood. We propose a unifying framework generalising this approach to a wide class of spaces and leading to an original extension of score matching. We illustrate the resulting models on various applications.

artificial intelligence, machine learning, objective, (18 more...)

2211.03595

Country:

North America > United States (0.27)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

arXiv.org Artificial Intelligence Apr-4-2023

Generalisation under gradient descent via deterministic PAC-Bayes

Clerico, Eugenio, Farghly, Tyler, Deligiannidis, George, Guedj, Benjamin, Doucet, Arnaud

We establish disintegrated PAC-Bayesian generalisation bounds for models trained with gradient descent methods or continuous gradient flows. Contrary to standard practice in the PAC-Bayesian setting, our result applies to optimisation algorithms that are deterministic, without requiring any de-randomisation step. Our bounds are fully computable, depending on the density of the initial distribution and the Hessian of the training objective over the trajectory. We show that our framework can be applied to a variety of iterative optimisation algorithms, including stochastic gradient descent (SGD), momentum-based schemes, and damped Hamiltonian dynamics.

artificial intelligence, bayesian inference, machine learning, (18 more...)

2209.02525

Country: Europe > United Kingdom > England (0.14)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

arXiv.org Artificial Intelligence Jan-19-2023

A Multi-Resolution Framework for U-Nets with Applications to Hierarchical VAEs

Falck, Fabian, Williams, Christopher, Danks, Dominic, Deligiannidis, George, Yau, Christopher, Holmes, Chris, Doucet, Arnaud, Willetts, Matthew

U-Net architectures are ubiquitous in state-of-the-art deep learning; however, their regularisation properties and relationship to wavelets are understudied. In this paper, we formulate a multi-resolution framework which identifies U-Nets as finite-dimensional truncations of models on an infinite-dimensional function space. We provide theoretical results which prove that average pooling corresponds to projection within the space of square-integrable functions and show that U-Nets with average pooling implicitly learn a Haar wavelet basis representation of the data. We then leverage our framework to identify state-of-the-art hierarchical VAEs (HVAEs), which have a U-Net architecture, as a type of two-step forward Euler discretisation of multi-resolution diffusion processes which flow from a point mass, introducing sampling instabilities. We also demonstrate that HVAEs learn a representation of time which allows for improved parameter efficiency through weight-sharing. We use this observation to achieve state-of-the-art HVAE performance with half the number of parameters of existing models, exploiting the properties of our continuous-time formulation.

artificial intelligence, hvae, machine learning, (17 more...)

2301.08187

Country:

Europe > United Kingdom (0.45)
North America > United States (0.27)

Genre: Research Report (1.00)

Industry:

Health & Medicine (1.00)
Information Technology (0.67)
Government > Regional Government (0.45)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

arXiv.org Artificial Intelligence Oct-14-2022

A Continuous Time Framework for Discrete Denoising Models

Campbell, Andrew, Benton, Joe, De Bortoli, Valentin, Rainforth, Tom, Deligiannidis, George, Doucet, Arnaud

We provide the first complete continuous time framework for denoising diffusion models of discrete data. This is achieved by formulating the forward noising process and corresponding reverse time generative process as Continuous Time Markov Chains (CTMCs). The model can be efficiently trained using a continuous time version of the ELBO. We simulate the high dimensional CTMC using techniques developed in chemical physics and exploit our continuous time framework to derive high performance samplers that we show can outperform discrete time methods for discrete data. The continuous time treatment also enables us to derive a novel theoretical result bounding the error between the generated sample distribution and the true data distribution.

artificial intelligence, dimension, machine learning, (17 more...)

2205.14987

Country: Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)

Genre: Research Report (0.63)

Industry: Media (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Vision (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

arXiv.org Machine Learning Jun-30-2022

Chained Generalisation Bounds

Clerico, Eugenio, Shidani, Amitis, Deligiannidis, George, Doucet, Arnaud

This work discusses how to derive upper bounds for the expected generalisation error of supervised learning algorithms by means of the chaining technique. By developing a general theoretical framework, we establish a duality between generalisation bounds based on the regularity of the loss function, and their chained counterparts, which can be obtained by lifting the regularity assumption from the loss onto its gradient. This allows us to re-derive the chaining mutual information bound from the literature, and to obtain novel chained information-theoretic generalisation bounds, based on the Wasserstein distance and other probability metrics. We show on some toy examples that the chained generalisation bound can be significantly tighter than its standard counterpart, particularly when the distribution of the hypotheses selected by the algorithm is very concentrated.

artificial intelligence, assumption, machine learning, (19 more...)

2203.00977

Country:

Europe > United Kingdom (0.46)
North America > United States (0.45)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

arXiv.org Machine Learning Jun-26-2022

Conditional Simulation Using Diffusion Schrödinger Bridges

Shi, Yuyang, De Bortoli, Valentin, Deligiannidis, George, Doucet, Arnaud

Denoising diffusion models have recently emerged as a powerful class of generative models. They provide state-of-the-art results, not only for unconditional simulation, but also when used to solve conditional simulation problems arising in a wide range of inverse problems. A limitation of these models is that they are computationally intensive at generation time as they require simulating a diffusion process over a long time horizon. When performing unconditional simulation, a Schrödinger bridge formulation of generative modeling leads to a theoretically grounded algorithm shortening generation time which is complementary to other proposed acceleration techniques. We extend the Schrödinger bridge framework to conditional simulation. We demonstrate this novel methodology on various applications including image super-resolution, optimal filtering for state-space models and the refinement of pre-trained networks. Our code can be found at https://github.com/vdeborto/cdsb.

artificial intelligence, iteration, machine learning, (15 more...)

2202.13460

Country: Europe (0.28)

Genre:

Research Report (0.50)
Instructional Material (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

arXiv.org Machine Learning Dec-1-2021

On Mixing Times of Metropolized Algorithm With Optimization Step (MAO): A New Framework

Khribch, EL Mahdi, Deligiannidis, George, Paulin, Daniel

The ability to draw samples from a distribution is at the heart of many applications within the Bayesian paradigm and, more generally, in computational statistics. Markov Chain Monte Carlo, pioneered by Metropolis et al. [1953], is often considered among practitioners as the default method for obtaining samples from distributions in a high-dimensional setting. In practice, variants of the Metropolis-Hastings algorithm enjoy tremendous success, notably in posterior exploration within a Bayesian setting (Carpenter et al. [2017], Smith [2014]). In addition, Monte Carlo methods are commonly deployed in several applications: estimating the posterior mean, computing expectations of quantities of interest, and computing volumes of particular sets. Recently, the research community has shown notable interest in sampling methods and their interplay with the more established field of optimization (Ma et al. [2019]). More specifically, due to the asymptotic nature of MCMC methods, a more tractable characterization of the dimension dependency of the convergence is an essential step towards a better understanding of the convergence of this class of algorithms and towards practical guidelines for practitioners.

algorithm, artificial intelligence, machine learning, (16 more...)

2112.00565

Country:

Europe > United Kingdom (0.14)
North America > United States (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

arXiv.org Machine Learning Oct-22-2021

Conditional Gaussian PAC-Bayes

Clerico, Eugenio, Deligiannidis, George, Doucet, Arnaud

Recent studies have empirically investigated different methods to train a stochastic classifier by optimising a PAC-Bayesian bound via stochastic gradient descent. Most of these procedures need to replace the misclassification error with a surrogate loss, leading to a mismatch between the optimisation objective and the actual generalisation bound. The present paper proposes a novel training algorithm that optimises the PAC-Bayesian bound, without relying on any surrogate loss. Empirical results show that the bounds obtained with this approach are tighter than those found in the literature.

artificial intelligence, bayesian inference, machine learning, (22 more...)

2110.11886

Country:

Europe > United Kingdom (0.14)
North America > United States (0.14)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report > New Finding (0.67)