Goto

Collaborating Authors

 Bayesian Inference


Constrained multi-objective optimization of process design parameters in settings with scarce data: an application to adhesive bonding

arXiv.org Artificial Intelligence

Adhesive joints are increasingly used in industry for a wide variety of applications because of their favorable characteristics such as high strength-to-weight ratio, design flexibility, limited stress concentrations, planar force transfer, good damage tolerance, and fatigue resistance. Finding the optimal process parameters for an adhesive bonding process is challenging: the optimization is inherently multi-objective (aiming to maximize break strength while minimizing cost), constrained (the process should not result in any visual damage to the materials, and stress tests should not result in failures that are adhesion-related), and uncertain (testing the same process parameters several times may lead to different break strengths). Real-life physical experiments in the lab are expensive to perform. Traditional evolutionary approaches (such as genetic algorithms) are then ill-suited to solve the problem, due to the prohibitive amount of experiments required for evaluation. Although Bayesian optimization-based algorithms are preferred to solve such expensive problems, few methods consider the optimization of more than one (noisy) objective and several constraints at the same time. In this research, we successfully applied specific machine learning techniques (Gaussian Process Regression) to emulate the objective and constraint functions based on a limited amount of experimental data. The techniques are embedded in a Bayesian optimization algorithm, which succeeds in detecting Pareto-optimal process settings in a highly efficient way (i.e., requiring a limited number of physical experiments).


Does Bayesian inference work in all Machine Learning scenarios part5

#artificialintelligence

Abstract: General robotic grippers are challenging to control because of their rich nonsmooth contact dynamics and the many sources of uncertainties due to the environment or sensor noise. In this work, we demonstrate how to compute 6-DoF grasp poses using simulation-based Bayesian inference through the full stochastic forward simulation of the robot in its environment while robustly accounting for many of the uncertainties in the system. A Riemannian manifold optimization procedure preserving the nonlinearity of the rotation space is used to compute the maximum a posteriori grasp pose. Abstract: Tensor decomposition is now being used for data analysis, information compression, and knowledge recovery. However, the mathematical property of tensor decomposition is not yet fully clarified because it is one of singular learning machines.


Bayesian learning of Causal Structure and Mechanisms with GFlowNets and Variational Bayes

arXiv.org Artificial Intelligence

Bayesian causal structure learning aims to learn a posterior distribution over directed acyclic graphs (DAGs), and the mechanisms that define the relationship between parent and child variables. By taking a Bayesian approach, it is possible to reason about the uncertainty of the causal model. The notion of modelling the uncertainty over models is particularly crucial for causal structure learning since the model could be unidentifiable when given only a finite amount of observational data. In this paper, we introduce a novel method to jointly learn the structure and mechanisms of the causal model using Variational Bayes, which we call Variational Bayes-DAG-GFlowNet (VBG). We extend the method of Bayesian causal structure learning using GFlowNets to learn not only the posterior distribution over the structure, but also the parameters of a linear-Gaussian model. Our results on simulated data suggest that VBG is competitive against several baselines in modelling the posterior over DAGs and mechanisms, while offering several advantages over existing methods, including the guarantee to sample acyclic graphs, and the flexibility to generalize to non-linear causal mechanisms.


Correcting Model Misspecification via Generative Adversarial Networks

arXiv.org Artificial Intelligence

Machine learning models are often misspecified in the likelihood, which leads to a lack of robustness in the predictions. In this paper, we introduce a framework for correcting likelihood misspecifications in several paradigm agnostic noisy prior models and test the model's ability to remove the misspecification. The "ABC-GAN" framework introduced is a novel generative modeling paradigm, which combines Generative Adversarial Networks (GANs) and Approximate Bayesian Computation (ABC). This new paradigm assists the existing GANs by incorporating any subjective knowledge available about the modeling process via ABC, as a regularizer, resulting in a partially interpretable model that operates well under low data regimes. At the same time, unlike any Bayesian analysis, the explicit knowledge need not be perfect, since the generator in the GAN can be made arbitrarily complex. ABC-GAN eliminates the need for summary statistics and distance metrics as the discriminator implicitly learns them and enables simultaneous specification of multiple generative models. The model misspecification is simulated in our experiments by introducing noise of various biases and variances. The correction term is learnt via the ABC-GAN, with skip connections, referred to as skipGAN. The strength of the skip connection indicates the amount of correction needed or how misspecified the prior model is. Based on a simple experimental setup, we show that the ABC-GAN models not only correct the misspecification of the prior, but also perform as well as or better than the respective priors under noisier conditions. In this proposal, we show that ABC-GANs get the best of both worlds.


Conservative objective models are a special kind of contrastive divergence-based energy model

arXiv.org Artificial Intelligence

In this work we theoretically show that conservative objective models (COMs) for offline model-based optimisation (MBO) are a special kind of contrastive divergence-based energy model, one where the energy function represents both the unconditional probability of the input and the conditional probability of the reward variable. While the initial formulation only samples modes from its learned distribution, we propose a simple fix that replaces its gradient ascent sampler with a Langevin MCMC sampler. This gives rise to a special probabilistic model where the probability of sampling an input is proportional to its predicted reward. Lastly, we show that better samples can be obtained if the model is decoupled so that the unconditional and conditional probabilities are modelled separately.


Wavelet Score-Based Generative Modeling Simon Coste Computer Science Department, Computer Science Department, ENS, CNRS, PSL University ENS, CNRS, PSL University Valentin De Bortoli

Neural Information Processing Systems

Score-based generative models (SGMs) synthesize new data samples from Gaussian white noise by running a time-reversed Stochastic Differential Equation (SDE) whose drift coefficient depends on some probabilistic score. The discretization of such SDEs typically requires a large number of time steps and hence a high computational cost. This is because of ill-conditioning properties of the score that we analyze mathematically. Previous approaches have relied on multiscale generation to considerably accelerate SGMs. We explain how this acceleration results from an implicit factorization of the data distribution into a product of conditional probabilities of wavelet coefficients across scales. The resulting Wavelet Score-based Generative Model (WSGM) synthesizes wavelet coefficients with the same number of time steps at all scales, and its time complexity therefore grows linearly with the image size. This is proved mathematically for Gaussian distributions, and shown numerically for physical processes at phase transition and natural image datasets.


Maximum Likelihood Competitive Learning

Neural Information Processing Systems

One popular class of unsupervised algorithms are competitive algo(cid:173) rithms. In the traditional view of competition, only one competitor, the winner, adapts for any given case. I propose to view compet(cid:173) itive adaptation as attempting to fit a blend of simple probability generators (such as gaussians) to a set of data-points. The maxi(cid:173) mum likelihood fit of a model of this type suggests a "softer" form of competition, in which all competitors adapt in proportion to the relative probability that the input came from each competitor. I investigate one application of the soft competitive model, place(cid:173) ment of radial basis function centers for function interpolation, and show that the soft model can give better performance with little additional computational cost.


Bayesian Inference of Regular Grammar and Markov Source Models

Neural Information Processing Systems

In this paper we develop a Bayes criterion which includes the Rissanen complexity, for inferring regular grammar models. We develop two methods for regular grammar Bayesian inference. The fIrst method is based on treating the regular grammar as a I-dimensional Markov source, and the second is based on the combinatoric characteristics of the regular grammar itself. We apply the resulting Bayes criteria to a particular example in order to show the efficiency of each method.


Bayesian Model Comparison and Backprop Nets

Neural Information Processing Systems

The Bayesian model comparison framework is reviewed, and the Bayesian Occam's razor is explained. This framework can be applied to feedforward networks, making possible (1) objective comparisons between solutions using alternative network architectures; (2) objective choice of magnitude and type of weight decay terms; (3) quantified estimates of the error bars on network parameters and on network output. The framework also gen(cid:173) erates a measure of the effective number of parameters determined by the data. The relationship of Bayesian model comparison to recent work on pre(cid:173) diction of generalisation ability (Guyon et al., 1992, Moody, 1992) is dis(cid:173) cussed. In science, a central task is to develop and compare models to account for the data that are gathered.


Bayesian Learning via Stochastic Dynamics

Neural Information Processing Systems

The attempt to find a single "optimal" weight vector in conven(cid:173) tional network training can lead to overfitting and poor generaliza(cid:173) tion. Bayesian methods avoid this, without the need for a valida(cid:173) tion set, by averaging the outputs of many networks with weights sampled from the posterior distribution given the training data. This sample can be obtained by simulating a stochastic dynamical system that has the posterior as its stationary distribution. I view neural networks as probabilistic models, and learning as statistical inference. Conventional network learning finds a single "optimal" set of network parameter values, corresponding to maximum likelihood or maximum penalized likelihood in(cid:173) ference.