Explaining the effects of non-convergent sampling in the training of Energy-Based Models
Agoritsas, Elisabeth, Catania, Giovanni, Decelle, Aurélien, Seoane, Beatriz
In this paper, we quantify the impact of using non-convergent Markov chains to train Energy-Based Models (EBMs). In particular, we show analytically that EBMs trained with non-persistent short runs to estimate the gradient can perfectly reproduce a set of empirical statistics of the data, not at the level of the equilibrium measure, but through a precise dynamical process. Our results provide a first-principles explanation for the observations of recent works proposing the strategy of using short runs starting from random initial conditions as an efficient way to generate high-quality samples in EBMs, and lay the groundwork for using EBMs as diffusion models. After explaining this effect in generic EBMs, we analyze two solvable models in which the effect of the non-convergent sampling in the trained parameters can be described in detail.
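To make the mechanism concrete, here is a minimal sketch, under assumptions (a toy pairwise energy, Metropolis dynamics, arbitrary sizes; not the authors' implementation), of training an EBM whose gradient uses short, non-persistent runs from random initial conditions:

```python
# Hedged sketch (not the paper's code): training a pairwise EBM
# E(x) = -x^T J x / 2 on x in {-1,+1}^N, with the model term of the
# log-likelihood gradient estimated by SHORT, NON-PERSISTENT Metropolis
# runs started from fresh random configurations. The toy dataset and all
# hyperparameters below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
N, k_steps, lr = 20, 10, 0.01

# toy "data": biased random spins (a stand-in for an empirical dataset)
X = np.where(rng.random((500, N)) < 0.7, 1.0, -1.0)

J = np.zeros((N, N))

def short_run(J, n_chains, k):
    """k Metropolis sweeps from random initial conditions (non-persistent)."""
    x = rng.choice([-1.0, 1.0], size=(n_chains, N))
    for _ in range(k):
        for i in range(N):
            dE = 2.0 * x[:, i] * (x @ J[:, i])   # energy cost of flipping spin i
            accept = (dE <= 0) | (rng.random(n_chains) < np.exp(-np.clip(dE, 0.0, 50.0)))
            x[accept, i] *= -1.0
    return x

for epoch in range(100):
    x_model = short_run(J, n_chains=500, k=k_steps)
    grad = X.T @ X / len(X) - x_model.T @ x_model / len(x_model)
    np.fill_diagonal(grad, 0.0)
    J += lr * grad
# At convergence, J reproduces the data correlations as seen AFTER k steps
# of this dynamics, not necessarily under the equilibrium Gibbs measure.
```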
Thermodynamics of bidirectional associative memories
Barra, Adriano, Catania, Giovanni, Decelle, Aurélien, Seoane, Beatriz
In this paper we investigate the equilibrium properties of bidirectional associative memories (BAMs). Introduced by Kosko in 1988 as a generalization of the Hopfield model to a bipartite structure, the simplest architecture is defined by two layers of neurons, with synaptic connections only between units of different layers: even without internal connections within each layer, information storage and retrieval are still possible through the reverberation of neural activities passing from one layer to another. We characterize the computational capabilities of a stochastic extension of this model in the thermodynamic limit, by applying rigorous techniques from statistical physics. A detailed picture of the phase diagram at the replica symmetric level is provided, both at finite temperature and in the noiseless regime. For the latter, the critical load is further investigated up to one step of replica symmetry breaking. An analytical and numerical inspection of the transition curves (namely the critical lines separating the various modes of operation of the machine) is carried out as the control parameters - noise, load and asymmetry between the two layer sizes - are tuned. In particular, with a finite asymmetry between the two layers, it is shown how the BAM can store information more efficiently than the Hopfield model by requiring fewer parameters to encode a fixed number of patterns. Comparisons are made with numerical simulations of neural dynamics. Finally, a low-load analysis is carried out to explain the retrieval mechanism in the BAM by analogy with two interacting Hopfield models. A potential equivalence with two coupled Restricted Boltzmann Machines is also discussed.
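For orientation, here is a hedged sketch of the retrieval mechanism just described, with a standard Hebbian construction assumed rather than taken from the paper:

```python
# Hedged sketch (an illustration, not the paper's code) of a BAM: two layers
# coupled only across layers by a Hebbian matrix, with noiseless retrieval
# through reverberating updates. Sizes and corruption level are assumptions.
import numpy as np

rng = np.random.default_rng(1)
N, M, P = 200, 120, 5                       # layer sizes, number of patterns
xi  = rng.choice([-1.0, 1.0], size=(P, N))  # patterns on layer 1
eta = rng.choice([-1.0, 1.0], size=(P, M))  # patterns on layer 2
W = xi.T @ eta / N                          # bipartite Hebbian couplings (N x M)

def reverberate(x, n_sweeps=20):
    """Zero-temperature dynamics: layer 1 drives layer 2 and vice versa."""
    for _ in range(n_sweeps):
        y = np.sign(W.T @ x)                # layer 1 -> layer 2
        x = np.sign(W @ y)                  # layer 2 -> layer 1
    return x, y

# start from pattern 0 with 20% of the bits flipped and check the overlaps
x0 = xi[0] * np.where(rng.random(N) < 0.2, -1.0, 1.0)
x, y = reverberate(x0)
print("m_xi =", x @ xi[0] / N, " m_eta =", y @ eta[0] / M)
```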
Learning a Restricted Boltzmann Machine using biased Monte Carlo sampling
Béreux, Nicolas, Decelle, Aurélien, Furtlehner, Cyril, Seoane, Beatriz
Restricted Boltzmann Machines are simple and powerful generative models that can encode any complex dataset. Despite all their advantages, in practice training is often unstable, and it is difficult to assess the quality of a trained model because its dynamics are affected by extremely slow time dependencies. This situation becomes critical when dealing with low-dimensional clustered datasets, where the time required to ergodically sample the trained models becomes computationally prohibitive. In this work, we show that this divergence of Monte Carlo mixing times is related to a phenomenon of phase coexistence, similar to the one that occurs in physics near a first-order phase transition. We show that sampling the equilibrium distribution via Markov chain Monte Carlo can be dramatically accelerated using biased sampling techniques, in particular the Tethered Monte Carlo (TMC) method. This sampling technique efficiently solves the problem of evaluating the quality of a given trained model and of generating new samples in a reasonable amount of time. Moreover, we show that it can also be used to improve the computation of the log-likelihood gradient during training, leading to dramatic improvements when training RBMs on artificial clustered datasets. On real low-dimensional datasets, this new training method fits RBM models with significantly faster relaxation dynamics than those obtained with standard PCD recipes. We also show that TMC sampling can be used to recover the free-energy profile of the RBM, which proves extremely useful for computing the probability distribution of a given model and for improving the generation of new decorrelated samples in slow PCD-trained models.
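To illustrate the flavour of such biased sampling, here is a simplified umbrella-style toy on a Curie-Weiss model; it conveys the tether-and-integrate idea but is emphatically NOT the paper's TMC implementation for RBMs:

```python
# Toy illustration (assumptions throughout): tether the magnetization m of a
# Curie-Weiss model to a target m* with a stiff spring, estimate the mean
# tethered force, and integrate it over m* to reconstruct a free-energy
# profile across the coexistence region where plain MC mixing diverges.
import numpy as np

rng = np.random.default_rng(2)
N, beta, kappa = 100, 1.5, 100.0            # below T_c: two coexisting phases

def mean_force(m_star, n_sweeps=200):
    """Single-spin-flip Metropolis with H/N = -m^2/2 + kappa*(m - m*)^2/2."""
    x = rng.choice([-1.0, 1.0], size=N)
    trace = []
    for _ in range(n_sweeps * N):
        i = rng.integers(N)
        m_old = x.mean()
        m_new = m_old - 2.0 * x[i] / N
        dH = N * (-(m_new**2 - m_old**2) / 2.0
                  + kappa * ((m_new - m_star)**2 - (m_old - m_star)**2) / 2.0)
        if dH <= 0 or rng.random() < np.exp(-beta * dH):
            x[i] *= -1.0
        trace.append(x.mean())
    m_avg = np.mean(trace[len(trace) // 2:])   # discard first half as burn-in
    return kappa * (m_star - m_avg)            # estimator of dF/dm*

m_grid = np.linspace(-1.0, 1.0, 21)
forces = np.array([mean_force(m) for m in m_grid])
# trapezoidal integration of the force gives the free-energy profile F(m)
F = np.concatenate(([0.0],
                    np.cumsum(0.5 * np.diff(m_grid) * (forces[1:] + forces[:-1]))))
```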
Equilibrium and non-Equilibrium regimes in the learning of Restricted Boltzmann Machines
Decelle, Aurélien, Furtlehner, Cyril, Seoane, Beatriz
Training Restricted Boltzmann Machines (RBMs) has been challenging for a long time due to the difficulty of precisely computing the log-likelihood gradient. Over the past decades, many works have proposed more or less successful training recipes, but without studying the quantity crucial to the problem: the mixing time, i.e. the number of Monte Carlo iterations needed to sample new configurations from the model. In this work, we show that this mixing time plays a crucial role in the dynamics and stability of the trained model, and that RBMs operate in two well-defined regimes, namely equilibrium and out-of-equilibrium, depending on the interplay between the mixing time of the model and the number of steps, $k$, used to approximate the gradient. We further show empirically that this mixing time increases during learning, which often implies a transition from one regime to the other as soon as $k$ becomes smaller than this time. In particular, we show that with the popular $k$-step (persistent) contrastive divergence approaches and small $k$, the dynamics of the learned model are extremely slow and often dominated by strong out-of-equilibrium effects. On the contrary, RBMs trained in equilibrium display faster dynamics and a smooth convergence to dataset-like configurations during sampling. Finally, we discuss how to exploit both regimes in practice, depending on the task one aims to fulfill: (i) a small $k$ can be used to generate convincing samples in short learning times, while (ii) a large (or increasingly large) $k$ is needed to learn the correct equilibrium distribution of the RBM. The existence of these two operational regimes seems to be a general property of energy-based models trained via likelihood maximization.
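A minimal sketch of the PCD-$k$ recipe the abstract refers to (the toy dataset and all hyperparameters are assumptions, not the paper's setup) makes the role of $k$ explicit:

```python
# Hedged sketch (not the paper's code) of RBM training with PCD-k: the
# persistent chains advance only k Gibbs steps per parameter update, so the
# gradient is reliable only while k exceeds the model's mixing time.
import numpy as np

rng = np.random.default_rng(3)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

Nv, Nh, n_chains, k, lr = 30, 15, 100, 1, 0.01     # PCD-1: the classic recipe
X = (rng.random((500, Nv)) < 0.5).astype(float)    # placeholder binary "data"

W = 0.01 * rng.normal(size=(Nv, Nh))
b, c = np.zeros(Nv), np.zeros(Nh)
v = (rng.random((n_chains, Nv)) < 0.5).astype(float)  # persistent chains

for step in range(200):
    ph_data = sigmoid(X @ W + c)                   # positive phase (exact in h)
    for _ in range(k):                             # negative phase: only k steps
        h = (rng.random((n_chains, Nh)) < sigmoid(v @ W + c)).astype(float)
        v = (rng.random((n_chains, Nv)) < sigmoid(h @ W.T + b)).astype(float)
    ph_model = sigmoid(v @ W + c)
    W += lr * (X.T @ ph_data / len(X) - v.T @ ph_model / n_chains)
    b += lr * (X.mean(axis=0) - v.mean(axis=0))
    c += lr * (ph_data.mean(axis=0) - ph_model.mean(axis=0))
# When the mixing time grows past k during learning, the chains lag behind
# the model and the learned distribution picks up out-of-equilibrium effects.
```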
Restricted Boltzmann Machine, recent advances and mean-field theory
Decelle, Aurélien, Furtlehner, Cyril
This review deals with the Restricted Boltzmann Machine (RBM) in the light of statistical physics. The RBM is a classical family of machine learning (ML) models which played a central role in the development of deep learning. Viewing it as a spin glass model and exhibiting various links with other models of statistical physics, we gather recent results dealing with mean-field theory in this context. First, the functioning of the RBM can be analyzed via the phase diagrams obtained for various statistical ensembles of RBMs, leading in particular to the identification of a {\it compositional phase}, in which a small number of features or modes are combined to form complex patterns. Then we discuss recent works that either devise mean-field-based learning algorithms, or reproduce generic aspects of the learning process from {\it ensemble dynamics equations} and/or from linear stability arguments.
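As a concrete anchor for the mean-field machinery mentioned above, the naive mean-field (lowest-order TAP) self-consistency equations for an RBM with $\pm 1$ units take the standard form (our notation, stated for orientation rather than quoted from the review):
$$ m_i = \tanh\Big(\theta_i + \sum_a w_{ia}\,\tau_a\Big), \qquad \tau_a = \tanh\Big(\eta_a + \sum_i w_{ia}\,m_i\Big), $$
where $m_i$ and $\tau_a$ are the visible and hidden magnetizations, $w_{ia}$ the couplings, and $\theta_i$, $\eta_a$ the biases; mean-field learning algorithms replace Monte Carlo estimates of the model correlations $\langle v_i h_a \rangle$ by $m_i \tau_a$ evaluated at fixed points of these equations.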
Robust Multi-Output Learning with Highly Incomplete Data via Restricted Boltzmann Machines
Fissore, Giancarlo, Decelle, Aurélien, Furtlehner, Cyril, Han, Yufei
In a standard multi-output classification scenario, both features and labels of the training data are only partially observed. This challenging issue arises widely in practice, due to sensor or database failures, crowd-sourcing, and noisy communication channels in industrial data analytics services. Classic methods for handling multi-output classification with incomplete supervision usually decompose the problem into an imputation stage that reconstructs the missing training information, followed by a learning stage that builds a classifier on the imputed training set. These methods fail to fully leverage the dependencies between features and labels. In order to take full advantage of these dependencies, we consider a purely probabilistic setting in which the feature imputation and multi-label classification problems are jointly solved. Indeed, we show that a simple Restricted Boltzmann Machine can be trained with an adapted algorithm based on mean-field equations to efficiently solve problems of inductive and transductive learning in which both features and labels are missing at random. The effectiveness of the approach is demonstrated empirically on various datasets, with a particular focus on a real-world Internet-of-Things security dataset.
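A hedged sketch of the joint mean-field treatment (our own illustrative helper, not the authors' code): features and labels share the visible layer, observed units are clamped, and the naive mean-field equations are iterated on the missing ones:

```python
# Illustration under assumptions of mean-field imputation with a trained
# binary 0/1 RBM: clamp the observed visible units (features and labels both
# live in the visible layer) and iterate damped mean-field updates.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mean_field_impute(v_obs, observed, W, b, c, n_iter=200, damping=0.5):
    """v_obs: visible vector (any values where observed is False).
    observed: boolean mask; W: (Nv, Nh) couplings; b, c: visible/hidden biases.
    Returns visible magnetizations, i.e. probabilities for the missing units."""
    m_v = np.where(observed, v_obs, 0.5)        # initialize unknowns at 1/2
    for _ in range(n_iter):
        m_h = sigmoid(m_v @ W + c)              # hidden magnetizations
        m_new = sigmoid(W @ m_h + b)            # visible mean-field update
        m_v = np.where(observed, v_obs, damping * m_v + (1.0 - damping) * m_new)
    return m_v

# usage: missing labels are just unobserved visible units, so transductive
# classification reads off m_v at the label positions after convergence.
```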
Gaussian-Spherical Restricted Boltzmann Machines
Decelle, Aurélien, Furtlehner, Cyril
We consider a special type of Restricted Boltzmann Machine (RBM), namely a Gaussian-spherical RBM, in which the visible units have Gaussian priors while the vector of hidden variables is constrained to stay on an ${\mathbbm L}_2$ sphere. The spherical constraint has the advantage of admitting exact asymptotic treatments, and various scaling regimes can be explicitly identified based solely on the spectral properties of the coupling matrix (also called the weight matrix of the RBM). Incidentally, these happen to be formally related to similar scaling behaviours obtained in a different context, namely the spatial condensation of zero-range processes. More specifically, when the spectrum of the coupling matrix is doubly degenerate, an exact treatment can be proposed to deal with finite-size effects. Interestingly, the known parallel between the ferromagnetic transition of the spherical model and Bose-Einstein condensation can be made explicit in that case. More importantly, this gives us the ability to extract all the response functions needed by the training algorithm of the RBM with arbitrary precision. This in turn allows us to numerically integrate, in a precise way, the dynamics of the spectrum of the weight matrix during learning. This dynamics reveals in particular a sequential emergence of modes from the Marchenko-Pastur bulk of singular values of the coupling matrix.
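For concreteness, a plausible form of the model consistent with the abstract (our notation, not quoted from the paper) is
$$ E(\mathbf{v},\mathbf{h}) = \frac{1}{2}\sum_{i=1}^{N_v} v_i^2 - \sum_{i,a} v_i\, w_{ia}\, h_a, \qquad \text{with } \sum_{a=1}^{N_h} h_a^2 = N_h. $$
Integrating out the Gaussian visible units gives $\int \mathrm{d}\mathbf{v}\, e^{-E(\mathbf{v},\mathbf{h})} \propto \exp\big(\tfrac{1}{2}\,\mathbf{h}^\top W^\top W\,\mathbf{h}\big)$, i.e. a spherical model for $\mathbf{h}$ with coupling matrix $W^\top W$: the model then depends on the weights only through their singular values, which is why the scaling regimes can be read off the spectrum.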