AITopics | Furtlehner, Cyril

Collaborating Authors

Furtlehner, Cyril

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

A theoretical framework for overfitting in energy-based modeling

Catania, Giovanni, Decelle, Aurélien, Furtlehner, Cyril, Seoane, Beatriz

arXiv.org Artificial IntelligenceJan-31-2025

We investigate the impact of limited data on training pairwise energy-based models for inverse problems aimed at identifying interaction networks. Utilizing the Gaussian model as testbed, we dissect training trajectories across the eigenbasis of the coupling matrix, exploiting the independent evolution of eigenmodes and revealing that the learning timescales are tied to the spectral decomposition of the empirical covariance matrix. We see that optimal points for early stopping arise from the interplay between these timescales and the initial conditions of training. Moreover, we show that finite data corrections can be accurately modeled through asymptotic random matrix theory calculations and provide the counterpart of generalized cross-validation in the energy based model context. Our analytical framework extends to binary-variable maximum-entropy pairwise models with minimal variations. These findings offer strategies to control overfitting in discrete-variable models through empirical shrinkage corrections, improving the management of overfitting in energy-based generative models.

artificial intelligence, machine learning, matrix, (19 more...)

arXiv.org Artificial Intelligence

2501.19158

Country: Europe > Spain (0.14)

Genre: Research Report (0.82)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

ANaGRAM: A Natural Gradient Relative to Adapted Model for efficient PINNs learning

Schwencke, Nilo, Furtlehner, Cyril

arXiv.org Artificial IntelligenceDec-14-2024

In the recent years, Physics Informed Neural Networks (PINNs) have received strong interest as a method to solve PDE driven systems, in particular for data assimilation purpose. This method is still in its infancy, with many shortcomings and failures that remain not properly understood. In this paper we propose a natural gradient approach to PINNs which contributes to speed-up and improve the accuracy of the training. Based on an in depth analysis of the differential geometric structures of the problem, we come up with two distinct contributions: (i) a new natural gradient algorithm that scales as $\min(P^2S, S^2P)$, where $P$ is the number of parameters, and $S$ the batch size; (ii) a mathematically principled reformulation of the PINNs problem that allows the extension of natural gradient to it, with proved connections to Green's function theory.

artificial intelligence, deep learning, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2412.10782

Country: Europe (0.28)

Genre: Research Report (0.40)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Fast, accurate training and sampling of Restricted Boltzmann Machines

Béreux, Nicolas, Decelle, Aurélien, Furtlehner, Cyril, Rosset, Lorenzo, Seoane, Beatriz

arXiv.org Artificial IntelligenceMay-24-2024

Thanks to their simple architecture, Restricted Boltzmann Machines (RBMs) are powerful tools for modeling complex systems and extracting interpretable insights from data. However, training RBMs, as other energy-based models, on highly structured data poses a major challenge, as effective training relies on mixing the Markov chain Monte Carlo simulations used to estimate the gradient. This process is often hindered by multiple second-order phase transitions and the associated critical slowdown. In this paper, we present an innovative method in which the principal directions of the dataset are integrated into a low-rank RBM through a convex optimization procedure. This approach enables efficient sampling of the equilibrium measure via a static Monte Carlo process. By starting the standard training process with a model that already accurately represents the main modes of the data, we bypass the initial phase transitions. Our results show that this strategy successfully trains RBMs to capture the full diversity of data in datasets where previous methods fail. Furthermore, we use the training trajectories to propose a new sampling method, {\em parallel trajectory tempering}, which allows us to sample the equilibrium measure of the trained model much faster than previous optimized MCMC approaches and a better estimation of the log-likelihood. We illustrate the success of the training method on several highly structured datasets.

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2405.15376

Country:

Europe (0.46)
North America > United States > New York (0.14)

Genre: Research Report > New Finding (0.86)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

Inferring effective couplings with Restricted Boltzmann Machines

Decelle, Aurélien, Furtlehner, Cyril, Gómez, Alfonso De Jesus Navas, Seoane, Beatriz

arXiv.org Artificial IntelligenceJan-24-2024

Generative models offer a direct way of modeling complex data. Energy-based models attempt to encode the statistical correlations observed in the data at the level of the Boltzmann weight associated with an energy function in the form of a neural network. We address here the challenge of understanding the physical interpretation of such models. In this study, we propose a simple solution by implementing a direct mapping between the Restricted Boltzmann Machine and an effective Ising spin Hamiltonian. This mapping includes interactions of all possible orders, going beyond the conventional pairwise interactions typically considered in the inverse Ising (or Boltzmann Machine) approach, and allowing the description of complex datasets. Earlier works attempted to achieve this goal, but the proposed mappings were inaccurate for inference applications, did not properly treat the complexity of the problem, or did not provide precise prescriptions for practical application. To validate our method, we performed several controlled inverse numerical experiments in which we trained the RBMs using equilibrium samples of predefined models with local external fields, 2-body and 3-body interactions in different sparse topologies. The results demonstrate the effectiveness of our proposed approach in learning the correct interaction network and pave the way for its application in modeling interesting binary variable datasets. We also evaluate the quality of the inferred model based on different training methods.

artificial intelligence, coupling, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2309.02292

Country:

Europe (0.46)
North America > United States (0.28)

Genre: Research Report > New Finding (0.86)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Add feedback

Learning a Restricted Boltzmann Machine using biased Monte Carlo sampling

Béreux, Nicolas, Decelle, Aurélien, Furtlehner, Cyril, Seoane, Beatriz

arXiv.org Artificial IntelligenceOct-5-2022

Restricted Boltzmann Machines are simple and powerful generative models that can encode any complex dataset. Despite all their advantages, in practice the trainings are often unstable and it is difficult to assess their quality because the dynamics are affected by extremely slow time dependencies. This situation becomes critical when dealing with low-dimensional clustered datasets, where the time required to sample ergodically the trained models becomes computationally prohibitive. In this work, we show that this divergence of Monte Carlo mixing times is related to a phenomenon of phase coexistence, similar to that which occurs in physics near a first-order phase transition. We show that sampling the equilibrium distribution using the Markov chain Monte Carlo method can be dramatically accelerated when using biased sampling techniques, in particular the Tethered Monte Carlo (TMC) method. This sampling technique efficiently solves the problem of evaluating the quality of a given trained model and generating new samples in a reasonable amount of time. Moreover, we show that this sampling technique can also be used to improve the computation of the log-likelihood gradient during training, leading to dramatic improvements in training RBMs with artificial clustered datasets. On real low-dimensional datasets, this new training method fits RBM models with significantly faster relaxation dynamics than those obtained with standard PCD recipes. We also show that TMC sampling can be used to recover the free-energy profile of the RBM. This proves to be extremely useful to compute the probability distribution of a given model and to improve the generation of new decorrelated samples in slow PCD-trained models.

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.21468/SciPostPhys.14.3.032

2206.0131

Country:

Europe (0.93)
North America > United States (0.28)

Genre: Research Report (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.92)

Add feedback

Equilibrium and non-Equilibrium regimes in the learning of Restricted Boltzmann Machines

Decelle, Aurélien, Furtlehner, Cyril, Seoane, Beatriz

arXiv.org Artificial IntelligenceNov-23-2021

Training Restricted Boltzmann Machines (RBMs) has been challenging for a long time due to the difficulty of computing precisely the log-likelihood gradient. Over the past decades, many works have proposed more or less successful training recipes but without studying the crucial quantity of the problem: the mixing time, i.e. the number of Monte Carlo iterations needed to sample new configurations from a model. In this work, we show that this mixing time plays a crucial role in the dynamics and stability of the trained model, and that RBMs operate in two well-defined regimes, namely equilibrium and out-of-equilibrium, depending on the interplay between this mixing time of the model and the number of steps, $k$, used to approximate the gradient. We further show empirically that this mixing time increases with the learning, which often implies a transition from one regime to another as soon as $k$ becomes smaller than this time. In particular, we show that using the popular $k$ (persistent) contrastive divergence approaches, with $k$ small, the dynamics of the learned model are extremely slow and often dominated by strong out-of-equilibrium effects. On the contrary, RBMs trained in equilibrium display faster dynamics, and a smooth convergence to dataset-like configurations during the sampling. Finally we discuss how to exploit in practice both regimes depending on the task one aims to fulfill: (i) short $k$ can be used to generate convincing samples in short learning times, (ii) large $k$ (or increasingly large) is needed to learn the correct equilibrium distribution of the RBM. Finally, the existence of these two operational regimes seems to be a general property of energy based models trained via likelihood maximization.

artificial intelligence, deep learning, machine learning, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.1088/1742-5468/ac98a7

2105.13889

Country:

North America (0.46)
Europe (0.28)

Genre: Research Report (0.82)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.86)

Add feedback

Restricted Boltzmann Machine, recent advances and mean-field theory

Decelle, Aurélien, Furtlehner, Cyril

arXiv.org Artificial IntelligenceMay-28-2021

This review deals with Restricted Boltzmann Machine (RBM) under the light of statistical physics. The RBM is a classical family of Machine learning (ML) models which played a central role in the development of deep learning. Viewing it as a Spin Glass model and exhibiting various links with other models of statistical physics, we gather recent results dealing with mean-field theory in this context. First the functioning of the RBM can be analyzed via the phase diagrams obtained for various statistical ensembles of RBM leading in particular to identify a {\it compositional phase} where a small number of features or modes are combined to form complex patterns. Then we discuss recent works either able to devise mean-field based learning algorithms; either able to reproduce generic aspects of the learning process from some {\it ensemble dynamics equations} or/and from linear stability arguments.

artificial intelligence, machine learning, rbm, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1088/1674-1056/abd160

2011.11307

Country:

Europe (0.46)
North America (0.45)

Genre:

Research Report (0.50)
Overview (0.48)

Industry:

Health & Medicine (1.00)
Energy > Oil & Gas (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Add feedback

Robust Multi-Output Learning with Highly Incomplete Data via Restricted Boltzmann Machines

Fissore, Giancarlo, Decelle, Aurélien, Furtlehner, Cyril, Han, Yufei

arXiv.org Machine LearningDec-19-2019

In a standard multi-output classification scenario, both features and labels of training data are partially observed. This challenging issue is widely witnessed due to sensor or database failures, crowd-sourcing and noisy communication channels in industrial data analytic services. Classic methods for handling multi-output classification with incomplete supervision information usually decompose the problem into an imputation stage that reconstructs the missing training information, and a learning stage that builds a classifier based on the imputed training set. These methods fail to fully leverage the dependencies between features and labels. In order to take full advantage of these dependencies we consider a purely probabilistic setting in which the features imputation and multi-label classification problems are jointly solved. Indeed, we show that a simple Restricted Boltzmann Machine can be trained with an adapted algorithm based on mean-field equations to efficiently solve problems of inductive and transductive learning in which both features and labels are missing at random. The effectiveness of the approach is demonstrated empirically on various datasets, with particular focus on a real-world Internet-of-Things security dataset.

dataset, deep learning, neural network, (19 more...)

arXiv.org Machine Learning

1912.09382

Country:

North America > United States (0.46)
Europe (0.28)

Genre: Research Report (0.82)

Industry: Information Technology (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.61)

Add feedback

Gaussian-Spherical Restricted Boltzmann Machines

Decelle, Aurélien, Furtlehner, Cyril

arXiv.org Artificial IntelligenceOct-31-2019

We consider a special type of Restricted Boltzmann machine (RBM), namely a Gaussian-spherical RBM where the visible units have Gaussian priors while the vector of hidden variables is constrained to stay on an ${\mathbbm L}_2$ sphere. The spherical constraint having the advantage to admit exact asymptotic treatments, various scaling regimes are explicitly identified based solely on the spectral properties of the coupling matrix (also called weight matrix of the RBM). Incidentally these happen to be formally related to similar scaling behaviours obtained in a different context dealing with spatial condensation of zero range processes. More specifically, when the spectrum of the coupling matrix is doubly degenerated an exact treatment can be proposed to deal with finite size effects. Interestingly the known parallel between the ferromagnetic transition of the spherical model and the Bose-Einstein condensation can be made explicit in that case. More importantly this gives us the ability to extract all needed response functions with arbitrary precision for the training algorithm of the RBM. This allows us then to numerically integrate the dynamics of the spectrum of the weight matrix during learning in a precise way. This dynamics reveals in particular a sequential emergence of modes from the Marchenko-Pastur bulk of singular vectors of the coupling matrix.

artificial intelligence, boltzmann machine, machine learning, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1088/1751-8121/ab79f3

1910.14544

Country: North America > United States (0.67)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.64)

Add feedback

Using Latent Binary Variables for Online Reconstruction of Large Scale Systems

Martin, Victorin, Lasgouttes, Jean-Marc, Furtlehner, Cyril

arXiv.org Machine LearningDec-23-2013

We propose a probabilistic graphical model realizing a minimal encoding of real variables dependencies based on possibly incomplete observation and an empirical cumulative distribution function per variable. The target application is a large scale partially observed system, like e.g. a traffic network, where a small proportion of real valued variables are observed, and the other variables have to be predicted. Our design objective is therefore to have good scalability in a real-time setting. Instead of attempting to encode the dependencies of the system directly in the description space, we propose a way to encode them in a latent space of binary variables, reflecting a rough perception of the observable (congested/non-congested for a traffic road). The method relies in part on message passing algorithms, i.e. belief propagation, but the core of the work concerns the definition of meaningful latent variables associated to the variables of interest and their pairwise dependencies. Numerical experiments demonstrate the applicability of the method in practice.

artificial intelligence, latent variable, machine learning, (17 more...)

arXiv.org Machine Learning

doi: 10.1007/s10472-015-9470-x

1312.6607

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Add feedback