AITopics | Aubin, Benjamin

Flash Diffusion: Accelerating Any Conditional Diffusion Model for Few Steps Image Generation

Chadebec, Clement, Tasar, Onur, Benaroche, Eyal, Aubin, Benjamin

arXiv.org Artificial IntelligenceJun-5-2024

In this paper, we propose an efficient, fast, and versatile distillation method to accelerate the generation of pre-trained diffusion models: Flash Diffusion. The method reaches state-of-the-art performances in terms of FID and CLIP-Score for few steps image generation on the COCO2014 and COCO2017 datasets, while requiring only several GPU hours of training and fewer trainable parameters than existing methods. In addition to its efficiency, the versatility of the method is also exposed across several tasks such as text-to-image, inpainting, face-swapping, super-resolution and using different backbones such as UNet-based denoisers (SD1.5, SDXL) or DiT (Pixart-$\alpha$), as well as adapters. In all cases, the method allowed to reduce drastically the number of sampling steps while maintaining very high-quality image generation. The official implementation is available at https://github.com/gojasper/flash-diffusion.

artificial intelligence, machine learning, student model, (15 more...)

arXiv.org Artificial Intelligence

2406.02347

Country: Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report (0.64)

Industry: Education (0.31)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Linear unit-tests for invariance discovery

Aubin, Benjamin, Słowik, Agnieszka, Arjovsky, Martin, Bottou, Leon, Lopez-Paz, David

arXiv.org Artificial IntelligenceFeb-22-2021

There is an increasing interest in algorithms to learn invariant correlations across training environments. A big share of the current proposals find theoretical support in the causality literature but, how useful are they in practice? The purpose of this note is to propose six linear low-dimensional problems --"unit tests"-- to evaluate different types of out-of-distribution generalization in a precise manner. Following initial experiments, none of the three recently proposed alternatives passes all tests.

artificial intelligence, inv, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2102.10867

Country:

Europe (0.29)
North America > United States > New York (0.14)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Generalization error in high-dimensional perceptrons: Approaching Bayes error with convex optimization

Aubin, Benjamin, Krzakala, Florent, Lu, Yue M., Zdeborová, Lenka

arXiv.org Machine LearningNov-7-2020

We consider a commonly studied supervised classification of a synthetic dataset whose labels are generated by feeding a one-layer neural network with random iid inputs. We study the generalization performances of standard classifiers in the high-dimensional regime where $\alpha=n/d$ is kept finite in the limit of a high dimension $d$ and number of samples $n$. Our contribution is three-fold: First, we prove a formula for the generalization error achieved by $\ell_2$ regularized classifiers that minimize a convex loss. This formula was first obtained by the heuristic replica method of statistical physics. Secondly, focussing on commonly used loss functions and optimizing the $\ell_2$ regularization strength, we observe that while ridge regression performance is poor, logistic and hinge regression are surprisingly able to approach the Bayes-optimal generalization error extremely closely. As $\alpha \to \infty$ they lead to Bayes-optimal rates, a fact that does not follow from predictions of margin-based generalization error bounds. Third, we design an optimal loss and regularizer that provably leads to Bayes-optimal generalization error.

bayesian inference, generalization error, neural network, (19 more...)

arXiv.org Machine Learning

2006.0656

Country:

North America > United States (0.93)
Europe > United Kingdom > England (0.14)

Genre: Research Report (0.83)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.40)

Add feedback

The spiked matrix model with generative priors

Aubin, Benjamin, Loureiro, Bruno, Maillard, Antoine, Krzakala, Florent, Zdeborová, Lenka

Neural Information Processing SystemsMar-19-2020, 00:02:11 GMT

Using a low-dimensional parametrization of signals is a generic and powerful way to enhance performance in signal processing and statistical inference. A very popular and widely explored type of dimensionality reduction is sparsity; another type is generative modelling of signal distributions. Generative models based on neural networks, such as GANs or variational auto-encoders, are particularly performant and are gaining on applicability. In this paper we study spiked matrix models, where a low-rank matrix is observed through a noisy channel. This problem with sparse structure of the spikes has attracted broad attention in the past literature.

artificial intelligence, neural network, spiked matrix model, (4 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.42)

Add feedback

Rademacher complexity and spin glasses: A link between the replica and statistical theories of learning

Abbara, Alia, Aubin, Benjamin, Krzakala, Florent, Zdeborová, Lenka

arXiv.org Machine LearningDec-5-2019

Statistical learning theory provides bounds of the generalization gap, using in particular the Vapnik-Chervonenkis dimension and the Rademacher complexity. An alternative approach, mainly studied in the statistical physics literature, is the study of generalization in simple synthetic-data models. Here we discuss the connections between these approaches and focus on the link between the Rademacher complexity in statistical learning and the theories of generalization for typical-case synthetic models from statistical physics, involving quantities known as Gardner capacity and ground state energy. We show that in these models the Rademacher complexity is closely related to the ground state energy computed by replica theories. Using this connection, one may reinterpret many results of the literature as rigorous Rademacher bounds in a variety of models in the high-dimensional statistics limit. Somewhat surprisingly, we also show that statistical learning theory provides predictions for the behavior of the ground-state energies in some full replica symmetry breaking models.

artificial intelligence, neural network, rademacher complexity, (17 more...)

arXiv.org Machine Learning

1912.02729

Country: Europe > France (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.89)

Add feedback

The spiked matrix model with generative priors

Aubin, Benjamin, Loureiro, Bruno, Maillard, Antoine, Krzakala, Florent, Zdeborová, Lenka

arXiv.org Machine LearningMay-30-2019

Using a low-dimensional parametrization of signals is a generic and powerful way to enhance performance in signal processing and statistical inference. A very popular and widely explored type of dimensionality reduction is sparsity; another type is generative modelling of signal distributions. Generative models based on neural networks, such as GANs or variational auto-encoders, are particularly performant and are gaining on applicability. In this paper we study spiked matrix models, where a low-rank matrix is observed through a noisy channel. This problem with sparse structure of the spikes has attracted broad attention in the past literature. Here, we replace the sparsity assumption by generative modelling, and investigate the consequences on statistical and algorithmic properties. We analyze the Bayes-optimal performance under specific generative models for the spike. In contrast with the sparsity assumption, we do not observe regions of parameters where statistical performance is superior to the best known algorithmic performance. We show that in the analyzed cases the approximate message passing algorithm is able to reach optimal performance. We also design enhanced spectral algorithms and analyze their performance and thresholds using random matrix theory, showing their superiority to the classical principal component analysis. We complement our theoretical results by illustrating the performance of the spectral algorithms when the spikes come from real datasets.

artificial intelligence, equation, neural network, (16 more...)

arXiv.org Machine Learning

1905.12385

Country:

Europe (0.67)
North America > United States (0.45)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

The committee machine: Computational to statistical gaps in learning a two-layers neural network

Aubin, Benjamin, Maillard, Antoine, barbier, jean, Krzakala, Florent, Macris, Nicolas, Zdeborová, Lenka

Neural Information Processing SystemsDec-31-2018

Heuristic tools from statistical physics have been used in the past to locate the phase transitions and compute the optimal learning and generalization errors in the teacher-student scenario in multi-layer neural networks. In this contribution, we provide a rigorous justification of these approaches for a two-layers neural network model called the committee machine. We also introduce a version of the approximate message passing (AMP) algorithm for the committee machine that allows to perform optimal learning in polynomial time for a large set of parameters. We find that there are regimes in which a low generalization error is information-theoretically achievable while the AMP algorithm fails to deliver it; strongly suggesting that no efficient algorithm exists for those cases, and unveiling a large computational gap. While the traditional approach to learning and generalization follows the Vapnik-Chervonenkis [1] and Rademacher [2] worst-case type bounds, there has been a considerable body of theoretical work on calculating the generalization ability of neural networks for data arising from a probabilistic model within the framework of statistical mechanics [3, 4, 5, 6, 7]. In the wake of the need to understand the effectiveness of neural networks and also the limitations of the classical approaches [8], it is of interest to revisit the results that have emerged thanks to the physics perspective. This direction is currently experiencing a strong revival, see e.g.

algorithm, artificial intelligence, neural network, (15 more...)

Neural Information Processing Systems

Country:

North America > United States (0.14)
Europe > France (0.14)
North America > Canada (0.14)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

The committee machine: Computational to statistical gaps in learning a two-layers neural network

Aubin, Benjamin, Maillard, Antoine, barbier, jean, Krzakala, Florent, Macris, Nicolas, Zdeborová, Lenka

Neural Information Processing SystemsDec-31-2018

Heuristic tools from statistical physics have been used in the past to locate the phase transitions and compute the optimal learning and generalization errors in the teacher-student scenario in multi-layer neural networks. In this contribution, we provide a rigorous justification of these approaches for a two-layers neural network model called the committee machine. We also introduce a version of the approximate message passing (AMP) algorithm for the committee machine that allows to perform optimal learning in polynomial time for a large set of parameters. We find that there are regimes in which a low generalization error is information-theoretically achievable while the AMP algorithm fails to deliver it; strongly suggesting that no efficient algorithm exists for those cases, and unveiling a large computational gap. While the traditional approach to learning and generalization follows the Vapnik-Chervonenkis [1] and Rademacher [2] worst-case type bounds, there has been a considerable body of theoretical work on calculating the generalization ability of neural networks for data arising from a probabilistic model within the framework of statistical mechanics [3, 4, 5, 6, 7]. In the wake of the need to understand the effectiveness of neural networks and also the limitations of the classical approaches [8], it is of interest to revisit the results that have emerged thanks to the physics perspective. This direction is currently experiencing a strong revival, see e.g.

algorithm, artificial intelligence, neural network, (14 more...)

Neural Information Processing Systems

Country:

North America > United States (0.14)
Europe > France (0.14)
North America > Canada (0.14)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

The committee machine: Computational to statistical gaps in learning a two-layers neural network

Aubin, Benjamin, Maillard, Antoine, Barbier, Jean, Krzakala, Florent, Macris, Nicolas, Zdeborová, Lenka

arXiv.org Machine LearningJun-14-2018

Heuristic tools from statistical physics have been used in the past to locate the phase transitions and compute the optimal learning and generalization errors in the teacher-student scenario in multi-layer neural networks. In this contribution, we provide a rigorous justification of these approaches for a two-layers neural network model called the committee machine. We also introduce a version of the approximate message passing (AMP) algorithm for the committee machine that allows to perform optimal learning in polynomial time for a large set of parameters. We find that there are regimes in which a low generalization error is information-theoretically achievable while the AMP algorithm fails to deliver it, strongly suggesting that no efficient algorithm exists for those cases, and unveiling a large computational gap.

artificial intelligence, generalization error, neural network, (16 more...)

arXiv.org Machine Learning

1806.05451

Country: Europe > United Kingdom > England (0.27)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Filters

Collaborating Authors

Aubin, Benjamin

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Flash Diffusion: Accelerating Any Conditional Diffusion Model for Few Steps Image Generation

Linear unit-tests for invariance discovery

Generalization error in high-dimensional perceptrons: Approaching Bayes error with convex optimization

The spiked matrix model with generative priors

Rademacher complexity and spin glasses: A link between the replica and statistical theories of learning

The spiked matrix model with generative priors

The committee machine: Computational to statistical gaps in learning a two-layers neural network

The committee machine: Computational to statistical gaps in learning a two-layers neural network

The committee machine: Computational to statistical gaps in learning a two-layers neural network