AITopics | Rey-Bellet, Luc

Combining Wasserstein-1 and Wasserstein-2 proximals: robust manifold learning via well-posed generative flows

Gu, Hyemin, Katsoulakis, Markos A., Rey-Bellet, Luc, Zhang, Benjamin J.

arXiv.org Machine Learning, Jul-16-2024

We formulate well-posed continuous-time generative flows for learning distributions that are supported on low-dimensional manifolds through Wasserstein proximal regularizations of $f$-divergences. Wasserstein-1 proximal operators regularize $f$-divergences so that singular distributions can be compared. Meanwhile, Wasserstein-2 proximal operators regularize the paths of the generative flows by adding an optimal transport cost, i.e., a kinetic energy penalization. Via mean-field game theory, we show that the combination of the two proximals is critical for formulating well-posed generative flows. Generative flows can be analyzed through the optimality conditions of a mean-field game (MFG): a system consisting of a backward Hamilton-Jacobi (HJ) equation and a forward continuity equation, both partial differential equations (PDEs), whose solution characterizes the optimal generative flow. For learning distributions that are supported on low-dimensional manifolds, the MFG theory shows that the Wasserstein-1 proximal, which addresses the HJ terminal condition, and the Wasserstein-2 proximal, which addresses the HJ dynamics, are both necessary for the corresponding backward-forward PDE system to be well-defined and have a unique solution with provably linear flow trajectories. This implies that the corresponding generative flow is also unique and can therefore be learned in a robust manner, even for high-dimensional distributions supported on low-dimensional manifolds. The generative flows are learned through adversarial training of continuous-time flows, which bypasses the need for reverse simulation. We demonstrate the efficacy of our approach for generating high-dimensional images without the need to resort to autoencoders or specialized architectures.
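
The remark that Wasserstein-1 lets singular distributions be compared can be illustrated with a minimal sketch. It is not the paper's proximal construction: it only contrasts the plain Wasserstein-1 distance with the Kullback-Leibler divergence for two point masses on the real line. The function names `wasserstein1_1d` and `kl_discrete` are hypothetical helpers introduced for this illustration.

```python
import math


def wasserstein1_1d(xs, ys):
    """W1 between two equal-size empirical distributions on the real line.

    In one dimension the optimal coupling matches sorted samples.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equally many samples")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def kl_discrete(p, q):
    """KL(p || q) for discrete distributions given as dicts point -> mass."""
    total = 0.0
    for point, mass in p.items():
        if mass == 0.0:
            continue
        if q.get(point, 0.0) == 0.0:
            return math.inf  # p puts mass where q has none
        total += mass * math.log(mass / q[point])
    return total


# Point masses at 0 and at 0.5: mutually singular distributions.
print(wasserstein1_1d([0.0], [0.5]))        # finite transport cost
print(kl_discrete({0.0: 1.0}, {0.5: 1.0}))  # infinite divergence
```

The transport distance shrinks smoothly as the two point masses approach each other, whereas the KL divergence stays infinite until they coincide, so it gives no usable training signal in this setting.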

artificial intelligence, generative flow, machine learning, (18 more...)

2407.11901

Country: North America > United States > Massachusetts (0.14)

Genre: Research Report (0.64)

Industry: Education (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Nonlinear denoising score matching for enhanced learning of structured distributions

Birrell, Jeremiah, Katsoulakis, Markos A., Rey-Bellet, Luc, Zhang, Benjamin, Zhu, Wei

arXiv.org Machine Learning, May-24-2024

We present a novel method for training score-based generative models which uses nonlinear noising dynamics to improve learning of structured distributions. Generalizing to a nonlinear drift allows for additional structure to be incorporated into the dynamics, thus making the training better adapted to the data, e.g., in the case of multimodality or (approximate) symmetries. Such structure can be obtained from the data by an inexpensive preprocessing step. The nonlinear dynamics introduces new challenges into training which we address in two ways: 1) we develop a new nonlinear denoising score matching (NDSM) method, 2) we introduce neural control variates in order to reduce the variance of the NDSM training objective. We demonstrate the effectiveness of this method on several examples: a) a collection of low-dimensional examples, motivated by clustering in latent space, b) high-dimensional images, addressing issues with mode collapse, small training sets, and approximate symmetries, the latter being a challenge for methods based on equivariant neural networks, which require exact symmetries.
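
As background only, the sketch below shows standard denoising score matching with linear, additive Gaussian noising. It is not the NDSM method of the abstract, whose nonlinear drift and neural control variates are not reproduced here. Under the assumption that the data are standard normal, it fits a linear score model `s(x) = a * x` by least squares and compares the result with the known score of the noised marginal. The function name `fit_linear_score` and all parameter values are hypothetical.

```python
import random


def fit_linear_score(sigma, n_samples=200_000, seed=0):
    """Least-squares fit of s(x) = a * x to denoising targets; returns a."""
    rng = random.Random(seed)
    num = 0.0
    den = 0.0
    for _ in range(n_samples):
        x0 = rng.gauss(0.0, 1.0)          # clean data sample, N(0, 1)
        eps = rng.gauss(0.0, 1.0)
        xt = x0 + sigma * eps             # linear (additive Gaussian) noising
        target = -(xt - x0) / sigma ** 2  # score of N(xt; x0, sigma^2) in xt
        num += xt * target
        den += xt * xt
    return num / den                      # minimizer of the mean squared error


sigma = 0.5
a_hat = fit_linear_score(sigma)
# The noised marginal is N(0, 1 + sigma^2), whose score is a_true * x.
a_true = -1.0 / (1.0 + sigma ** 2)
print(a_hat, a_true)
```

The regression targets depend only on the added noise, yet the fitted slope approaches the score of the noised data distribution. That is the property denoising score matching relies on.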

artificial intelligence, machine learning, objective, (15 more...)

2405.15625

Country: North America > United States > Massachusetts > Hampshire County > Amherst (0.15)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Learning heavy-tailed distributions with Wasserstein-proximal-regularized $\alpha$-divergences

Chen, Ziyu, Gu, Hyemin, Katsoulakis, Markos A., Rey-Bellet, Luc, Zhu, Wei

arXiv.org Machine Learning, May-22-2024

Heavy tails are ubiquitous, emerging in various fields such as extreme events in ocean waves [9], floods [21], social sciences [27, 16], human activities [17, 35], biology [18] and computer sciences [29]. Learning to generate heavy-tailed target distributions has been explored using GANs through tail estimation [10, 15, 1]. While estimating the tail behavior of a heavy-tailed distribution is important, selecting objectives that measure discrepancies between these distributions and facilitate stable learning is equally crucial. In generative modeling, the goal is to generate samples that mimic those from an underlying data distribution, typically by designing algorithms that minimize a probability divergence between the generated and target distributions. Thus, it is crucial to choose a divergence that flexibly and accurately respects the behavior of the data distribution.
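
For orientation, the $\alpha$-divergences named in the title belong to the family of $f$-divergences. One common parametrization is written out below. The normalization of $f_\alpha$ is an assumption, since the excerpt does not state which convention the authors use.

```latex
D_f(Q \,\|\, P) = \mathbb{E}_P\!\left[ f\!\left( \frac{dQ}{dP} \right) \right],
\qquad
f_\alpha(x) = \frac{x^\alpha - 1}{\alpha(\alpha - 1)},
\quad \alpha > 0,\ \alpha \neq 1 .
```

Here $f$ is convex with $f(1) = 0$, and $D_f(Q \,\|\, P)$ is finite only when $Q$ is absolutely continuous with respect to $P$.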

artificial intelligence, heavy-tailed distribution, machine learning, (17 more...)

2405.13962

Country: North America > United States > Massachusetts > Hampshire County > Amherst (0.14)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.94)

Statistical Guarantees of Group-Invariant GANs

Chen, Ziyu, Katsoulakis, Markos A., Rey-Bellet, Luc, Zhu, Wei

arXiv.org Machine Learning, Oct-16-2023

Group-invariant generative adversarial networks (GANs) are a type of GAN in which the generators and discriminators are hardwired with group symmetries. Empirical studies have shown that these networks are capable of learning group-invariant distributions with significantly improved data efficiency. In this study, we aim to rigorously quantify this improvement by analyzing the reduction in sample complexity for group-invariant GANs. Our findings indicate that when learning group-invariant distributions, the number of samples required for group-invariant GANs decreases proportionally with a power of the group size, and this power depends on the intrinsic dimension of the distribution's support. To our knowledge, this work presents the first statistical estimation for group-invariant generative models, specifically for GANs, and it may shed light on the study of other group-invariant generative models.

artificial intelligence, group-invariant gan, machine learning, (18 more...)

2305.13517

Country: North America > United States > Massachusetts > Hampshire County > Amherst (0.14)

Genre: Research Report > New Finding (0.86)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.89)

Lipschitz-regularized gradient flows and generative particle algorithms for high-dimensional scarce data

Gu, Hyemin, Birmpa, Panagiota, Pantazis, Yannis, Rey-Bellet, Luc, Katsoulakis, Markos A.

arXiv.org Artificial Intelligence, Jul-24-2023

We build a new class of generative algorithms capable of efficiently learning an arbitrary target distribution from possibly scarce, high-dimensional data and subsequently generating new samples. These generative algorithms are particle-based and are constructed as gradient flows of Lipschitz-regularized Kullback-Leibler or other $f$-divergences, where data from a source distribution can be stably transported as particles towards the vicinity of the target distribution. As a highlighted result in data integration, we demonstrate that the proposed algorithms correctly transport gene expression data points with dimension exceeding 54K, while the sample size is typically only in the hundreds.

artificial intelligence, machine learning, particle, (19 more...)

2210.1723

Country: North America > United States > Massachusetts > Hampshire County > Amherst (0.14)

Genre: Research Report (0.64)

Industry:

Energy > Oil & Gas > Upstream (0.68)
Health & Medicine > Pharmaceuticals & Biotechnology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Data Science (0.88)

Sample Complexity of Probability Divergences under Group Symmetry

Chen, Ziyu, Katsoulakis, Markos A., Rey-Bellet, Luc, Zhu, Wei

arXiv.org Machine Learning, May-22-2023

We rigorously quantify the improvement in the sample complexity of variational divergence estimations for group-invariant distributions. In the cases of the Wasserstein-1 metric and the Lipschitz-regularized $\alpha$-divergences, the reduction of sample complexity is proportional to an ambient-dimension-dependent power of the group size. For the maximum mean discrepancy (MMD), the improvement of sample complexity is more nuanced, as it depends not only on the group size but also on the choice of kernel. Numerical simulations verify our theories.
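
The kernel dependence of the MMD can be seen even without any group structure. The sketch below computes the biased (V-statistic) estimate of the squared MMD between one-dimensional samples with a Gaussian kernel. The kernel choice, the bandwidth values, and the function names `gaussian_kernel` and `mmd_squared` are assumptions made for illustration and are not taken from the paper.

```python
import math


def gaussian_kernel(x, y, bandwidth):
    """Gaussian (RBF) kernel on the real line."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth):
    """Biased (V-statistic) estimate of MMD^2 between two 1D samples."""
    def mean_kernel(a, b):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in a for v in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


xs = [0.0, 0.1, -0.2, 0.3]
ys = [1.0, 1.2, 0.9, 1.1]
for bandwidth in (0.1, 1.0, 10.0):
    # Same samples, different kernels: the reported discrepancy changes.
    print(bandwidth, mmd_squared(xs, ys, bandwidth))
```

With a very wide kernel the two samples look almost identical to the estimator, so the kernel must be fixed before any statement about sample complexity can be made.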

artificial intelligence, machine learning, probability divergence, (11 more...)

2302.01915

Country: North America > United States > New York > New York County > New York City (0.14)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Function-space regularized Rényi divergences

Birrell, Jeremiah, Pantazis, Yannis, Dupuis, Paul, Katsoulakis, Markos A., Rey-Bellet, Luc

arXiv.org Artificial Intelligence, Feb-14-2023

We propose a new family of regularized Rényi divergences parametrized not only by the order $\alpha$ but also by a variational function space. These new objects are defined by taking the infimal convolution of the standard Rényi divergence with the integral probability metric (IPM) associated with the chosen function space. We derive a novel dual variational representation that can be used to construct numerically tractable divergence estimators. This representation avoids risk-sensitive terms and therefore exhibits lower variance, making it well-behaved when $\alpha>1$; this addresses a notable weakness of prior approaches. We prove several properties of these new divergences, showing that they interpolate between the classical Rényi divergences and IPMs. We also study the $\alpha\to\infty$ limit, which leads to a regularized worst-case-regret and a new variational representation in the classical case. Moreover, we show that the proposed regularized Rényi divergences inherit features from IPMs such as the ability to compare distributions that are not absolutely continuous, e.g., empirical measures and distributions with low-dimensional support. We present numerical results on both synthetic and real datasets, showing the utility of these new divergences in both estimation and GAN training applications; in particular, we demonstrate significantly reduced variance and improved training performance.

artificial intelligence, divergence, machine learning, (18 more...)

2210.04974

Country: North America > United States > Massachusetts (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Structure-preserving GANs

Birrell, Jeremiah, Katsoulakis, Markos A., Rey-Bellet, Luc, Zhu, Wei

arXiv.org Machine Learning, Feb-2-2022

Generative adversarial networks (GANs), a class of distribution-learning methods based on a two-player game between a generator and a discriminator, can generally be formulated as a minmax problem based on the variational representation of a divergence between the unknown and the generated distributions. We introduce structure-preserving GANs as a data-efficient framework for learning distributions with additional structure such as group symmetry, by developing new variational representations for divergences. Our theory shows that we can reduce the discriminator space to its projection on the invariant discriminator space, using the conditional expectation with respect to the $\sigma$-algebra associated to the underlying structure. In addition, we prove that the discriminator space reduction must be accompanied by a careful design of structured generators, as flawed designs may easily lead to a catastrophic "mode collapse" of the learned distribution. We contextualize our framework by building symmetry-preserving GANs for distributions with intrinsic group symmetry, and demonstrate that both players, namely the equivariant generator and invariant discriminator, play important but distinct roles in the learning process. Empirical experiments and ablation studies across a broad range of data sets, including real-world medical imaging, validate our theory, and show our proposed methods achieve significantly improved sample fidelity and diversity (almost an order of magnitude measured in Fréchet Inception Distance), especially in the small data regime.
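
For a finite group, one simple way to obtain an invariant discriminator is to average an arbitrary discriminator over the group orbit of its input. The sketch below does this for the group of planar rotations by multiples of 90 degrees. It is a toy illustration under that assumption, not the paper's construction, and the names `rotate90`, `c4_orbit`, `symmetrize`, and `raw_discriminator` are hypothetical.

```python
def rotate90(p):
    """Rotate a planar point counterclockwise by 90 degrees."""
    x, y = p
    return (-y, x)


def c4_orbit(p):
    """Orbit of p under the four rotations by multiples of 90 degrees."""
    orbit = [p]
    for _ in range(3):
        orbit.append(rotate90(orbit[-1]))
    return orbit


def symmetrize(discriminator):
    """Return the group-averaged (rotation-invariant) discriminator."""
    def invariant(p):
        values = [discriminator(q) for q in c4_orbit(p)]
        return sum(values) / len(values)

    return invariant


def raw_discriminator(p):
    """A toy discriminator that is not rotation invariant."""
    x, y = p
    return x * x + y


invariant_discriminator = symmetrize(raw_discriminator)
point = (1.0, 2.0)
print(raw_discriminator(point), raw_discriminator(rotate90(point)))
print(invariant_discriminator(point), invariant_discriminator(rotate90(point)))
```

For this toy discriminator the group average equals half the squared Euclidean norm of the input, which depends only on the distance from the origin. The two raw values differ, while the two averaged values agree.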

diagnostic medicine, machine learning, teaching method, (20 more...)

2202.01129

Country: North America > United States > Massachusetts > Hampshire County > Amherst (0.14)

Genre: Research Report (0.64)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)

Model Uncertainty and Correctability for Directed Graphical Models

Birmpa, Panagiota, Feng, Jinchao, Katsoulakis, Markos A., Rey-Bellet, Luc

arXiv.org Machine Learning, Jul-17-2021

Probabilistic graphical models are a fundamental tool in probabilistic modeling, machine learning and artificial intelligence. They allow us to integrate in a natural way expert knowledge, physical modeling, heterogeneous and correlated data and quantities of interest. For exactly this reason, multiple sources of model uncertainty are inherent within the modular structure of the graphical model. In this paper we develop information-theoretic, robust uncertainty quantification methods and non-parametric stress tests for directed graphical models to assess the effect and the propagation through the graph of multi-sourced model uncertainties to quantities of interest. These methods allow us to rank the different sources of uncertainty and correct the graphical model by targeting its most impactful components with respect to the quantities of interest. Thus, from a machine learning perspective, we provide a mathematically rigorous approach to correctability that guarantees a systematic selection for improvement of components of a graphical model while controlling potential new errors created in the process in other parts of the model. We demonstrate our methods in two physico-chemical examples, namely quantum scale-informed chemical kinetics and materials screening to improve the efficiency of fuel cells.

bayesian inference, bayesian network, us government, (17 more...)

2107.08179

Country: North America > United States > Massachusetts (0.28)

Genre: Research Report (0.81)

Industry:

Government > Military (0.45)
Government > Regional Government > North America Government > United States Government (0.45)
Energy > Renewable > Hydrogen (0.34)
Energy > Energy Storage (0.34)

Technology:

Information Technology > Artificial Intelligence > Systems & Languages (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
(2 more...)

$(f,\Gamma)$-Divergences: Interpolating between $f$-Divergences and Integral Probability Metrics

Birrell, Jeremiah, Dupuis, Paul, Katsoulakis, Markos A., Pantazis, Yannis, Rey-Bellet, Luc

arXiv.org Machine Learning, Nov-11-2020

We develop a general framework for constructing new information-theoretic divergences that rigorously interpolate between $f$-divergences and integral probability metrics (IPMs), such as the Wasserstein distance. These new divergences inherit features from IPMs, such as the ability to compare distributions which are not absolutely continuous, as well as from $f$-divergences, for instance the strict concavity of their variational representations and the ability to compare heavy-tailed distributions. When combined, these features establish a divergence with improved convergence and estimation properties for statistical learning applications. We demonstrate their use in the training of generative adversarial networks (GANs) for heavy-tailed data and also show they can provide improved performance over gradient-penalized Wasserstein GAN in image generation.
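
For reference, the two classical objects named in the abstract have the following standard variational forms. These are textbook definitions rather than the paper's new divergence. Here $f^*$ denotes the Legendre transform of the convex function $f$, and $\Gamma$ is a chosen class of test functions (for example, 1-Lipschitz functions in the case of the Wasserstein-1 distance).

```latex
% Variational (Legendre) representation of an f-divergence,
% with the supremum taken over bounded measurable functions g:
D_f(Q \,\|\, P) = \sup_{g} \left\{ \mathbb{E}_Q[g] - \mathbb{E}_P[f^*(g)] \right\}

% Integral probability metric over a function class \Gamma:
W^{\Gamma}(Q, P) = \sup_{g \in \Gamma} \left\{ \mathbb{E}_Q[g] - \mathbb{E}_P[g] \right\}
```

The first objective is nonlinear in $g$ through $f^*$, while the second is linear in $g$ and draws its discriminating power entirely from the restriction to $\Gamma$.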

artificial intelligence, neural network, november 12, (17 more...)

2011.05953

Country: North America > United States > Massachusetts > Hampshire County > Amherst (0.14)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
