 Bayesian Inference


Reliable Discretization of Deterministic Equations in Bayesian Networks

AAAI Conferences

We focus on the problem of modeling deterministic equations over continuous variables in discrete Bayesian networks. This is typically achieved by a discretization of both input and output variables and a degenerate quantification of the corresponding conditional probability tables. This approach, based on classical probabilities, cannot properly model the information loss induced by the discretization. We show that a reliable modeling of such epistemic uncertainty can instead be achieved by credal sets, i.e., convex sets of probability mass functions. This transforms the original Bayesian network into a credal network, possibly returning interval-valued inferences that are robust with respect to the information loss induced by the discretization. Algorithmic strategies for an optimal choice of the discretization bins are also provided.
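The information loss described in this abstract can be illustrated with a minimal sketch. It is not the paper's construction: it assumes a monotone function `f`, and the helper names (`bin_index`, `reachable_bins`, `interval_cpt`) are hypothetical. For each input bin it lists the output bins that `f` can reach and assigns lower and upper probability bounds, leaving the bounds vacuous (0 to 1) whenever more than one output bin is reachable.

```python
from bisect import bisect_right


def bin_index(edges, value):
    """Index of the bin [edges[i], edges[i+1]) containing value (clamped to valid bins)."""
    i = bisect_right(edges, value) - 1
    return min(max(i, 0), len(edges) - 2)


def reachable_bins(f, x_lo, x_hi, y_edges):
    """Output bins reachable from the input bin [x_lo, x_hi].

    Assumption: f is monotone on the bin, so its range is spanned by the endpoints.
    """
    y_a, y_b = f(x_lo), f(x_hi)
    lo, hi = min(y_a, y_b), max(y_a, y_b)
    return list(range(bin_index(y_edges, lo), bin_index(y_edges, hi) + 1))


def interval_cpt(f, x_edges, y_edges):
    """One (lower, upper) pair of probability bounds per input bin.

    A single reachable bin gives a degenerate row (lower == upper).
    Several reachable bins give vacuous bounds over those bins.
    """
    n_out = len(y_edges) - 1
    rows = []
    for x_lo, x_hi in zip(x_edges[:-1], x_edges[1:]):
        bins = reachable_bins(f, x_lo, x_hi, y_edges)
        upper = [1.0 if j in bins else 0.0 for j in range(n_out)]
        lower = [1.0 if bins == [j] else 0.0 for j in range(n_out)]
        rows.append((lower, upper))
    return rows


# Illustrative example: y = x^2 with three input bins and three output bins.
rows = interval_cpt(lambda x: x * x, [0, 1, 2, 3], [0, 3, 6, 9])
for lower, upper in rows:
    print(lower, upper)
```

In this example only the first input bin maps to a single output bin; the other two straddle an output boundary, which is exactly the case a degenerate table cannot represent.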


Hierarchical Classification With Bayesian Networks and Chained Classifiers

AAAI Conferences

This work proposes a method for hierarchical classification that takes advantage of the hierarchical structure so that the predictions of local classifiers are influenced by their neighbors. To achieve this, two strategies are combined. The first is to represent the hierarchical structure as a Bayesian network, and the second is to build chained classifiers that feed the Bayesian network as local classifiers. The proposed method was tested on several functional genomics datasets, which consist of tree-structured hierarchies. The results of several variants of the proposed method are compared with the standard methods, Flat and Top-Down, as well as with a state-of-the-art technique, showing superior performance under several metrics.


Output-Constrained Bayesian Neural Networks

arXiv.org Machine Learning

Bayesian neural network (BNN) priors are defined in parameter space, making it hard to encode prior knowledge expressed in function space. We formulate a prior that incorporates functional constraints about what the output can or cannot be in regions of the input space. Output-Constrained BNNs (OC-BNNs) represent an interpretable approach to enforcing a range of constraints, fully consistent with the Bayesian framework and amenable to black-box inference. We demonstrate how OC-BNNs improve model robustness and prevent the prediction of infeasible outputs in two real-world applications in healthcare and robotics.


Information criteria for non-normalized models

arXiv.org Machine Learning

Many statistical models are given in the form of non-normalized densities with an intractable normalization constant. Since maximum likelihood estimation is computationally intensive for these models, several estimation methods have been developed which do not require explicit computation of the normalization constant, such as noise contrastive estimation (NCE) and score matching. However, model selection methods for general non-normalized models have not been proposed so far. In this study, we develop information criteria for non-normalized models estimated by NCE or score matching. They are derived as approximately unbiased estimators of discrepancy measures for non-normalized models. Experimental results demonstrate that the proposed criteria enable selection of the appropriate non-normalized model in a data-driven manner. Extension to a finite mixture of non-normalized models is also discussed.


Seismic Bayesian evidential learning: Estimation and uncertainty quantification of sub-resolution reservoir properties

arXiv.org Machine Learning

We present a framework that enables estimation of low-dimensional sub-resolution reservoir properties directly from seismic data, without requiring the solution of a high-dimensional seismic inverse problem. Our workflow is based on the Bayesian evidential learning approach and exploits learning the direct relation between seismic data and reservoir properties to efficiently estimate reservoir properties. The theoretical framework we develop allows incorporation of non-linear statistical models for seismic estimation problems. Uncertainty quantification is performed with Approximate Bayesian Computation. With the help of a synthetic example of estimation of reservoir net-to-gross and average fluid saturations in a sub-resolution thin-sand reservoir, several nuances are foregrounded regarding the applicability of unsupervised and supervised learning methods for seismic estimation problems. Finally, we demonstrate the efficacy of our approach by estimating posterior uncertainty of reservoir net-to-gross in a sub-resolution thin-sand reservoir from an offshore delta dataset using 3D pre-stack seismic data.


Moment-Based Variational Inference for Markov Jump Processes

arXiv.org Machine Learning

We propose moment-based variational inference as a flexible framework for approximate smoothing of latent Markov jump processes. The main ingredient of our approach is to partition the set of all transitions of the latent process into classes. This allows us to express the Kullback-Leibler divergence between the approximate and the exact posterior process in terms of a set of moment functions that arise naturally from the chosen partition. To illustrate possible choices of the partition, we consider special classes of jump processes that frequently occur in applications. We then extend the results to parameter inference and demonstrate the method on several examples.


Imputing Missing Events in Continuous-Time Event Streams

arXiv.org Machine Learning

Events in the world may be caused by other, unobserved events. We consider sequences of events in continuous time. Given a probability model of complete sequences, we propose particle smoothing (a form of sequential importance sampling) to impute the missing events in an incomplete sequence. We develop a trainable family of proposal distributions based on a type of bidirectional continuous-time LSTM: bidirectionality lets the proposals condition on future observations, not just on the past as in particle filtering. Our method can sample an ensemble of possible complete sequences (particles), from which we form a single consensus prediction that has low Bayes risk under our chosen loss metric. We experiment in multiple synthetic and real domains, using different missingness mechanisms, and modeling the complete sequences in each domain with a neural Hawkes process (Mei & Eisner 2017). On held-out incomplete sequences, our method is effective at inferring the ground-truth unobserved events, with particle smoothing consistently improving upon particle filtering.


Approximate Bayesian computation via the energy statistic

arXiv.org Machine Learning

Approximate Bayesian computation (ABC) has become an essential part of the Bayesian toolbox for addressing problems in which the likelihood is prohibitively expensive or entirely unknown, making it intractable. ABC defines a quasi-posterior by comparing observed data with simulated data, traditionally based on some summary statistics, the elicitation of which is regarded as a key difficulty. In recent years, a number of data discrepancy measures bypassing the construction of summary statistics have been proposed, including the Kullback-Leibler divergence, the Wasserstein distance and maximum mean discrepancies. Here we propose a novel importance-sampling (IS) ABC algorithm relying on the so-called two-sample energy statistic. We establish a new asymptotic result for the case where both the observed sample size and the simulated data sample size increase to infinity, which highlights to what extent the data discrepancy measure impacts the asymptotic pseudo-posterior. The result holds in the broad setting of IS-ABC methodologies, thus generalizing previous results that have been established only for rejection ABC algorithms. Furthermore, we propose a consistent V-statistic estimator of the energy statistic, under which we show that the large sample result holds. Our proposed energy statistic based ABC algorithm is demonstrated on a variety of models, including a Gaussian mixture, a moving-average model of order two, a bivariate beta and a multivariate g-and-k distribution. We find that our proposed method compares well with alternative discrepancy measures.
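As a rough illustration of the idea in this abstract, the sketch below computes the standard V-statistic form of the two-sample energy distance for one-dimensional samples and uses it as the discrepancy in a kernel-weighted ABC step. It is not the paper's algorithm: the Gaussian kernel, the bandwidth `epsilon`, the choice of proposal equal to prior, and the toy Gaussian location model are all assumptions made for the example, and the function names are hypothetical.

```python
import math
import random


def energy_v_statistic(x, y):
    """V-statistic estimate of the energy distance between two 1-D samples:
    2 E|X - Y| - E|X - X'| - E|Y - Y'|, with all pairs (including i == j) averaged.
    """
    n, m = len(x), len(y)
    cross = sum(abs(a - b) for a in x for b in y) / (n * m)
    within_x = sum(abs(a - b) for a in x for b in x) / (n * n)
    within_y = sum(abs(a - b) for a in y for b in y) / (m * m)
    return 2.0 * cross - within_x - within_y


def abc_weights(observed, thetas, simulate, epsilon):
    """Unnormalised ABC weights for candidate parameters.

    Assumptions: candidates are drawn from the prior (so the prior/proposal
    ratio is 1) and a Gaussian kernel with bandwidth epsilon is used.
    """
    weights = []
    for theta in thetas:
        d = energy_v_statistic(observed, simulate(theta))
        weights.append(math.exp(-0.5 * (d / epsilon) ** 2))
    return weights


# Toy example: unit-variance Gaussian with unknown mean; the data use mean 2.0.
rng = random.Random(0)
observed = [rng.gauss(2.0, 1.0) for _ in range(200)]
thetas = [0.0, 1.0, 2.0, 3.0, 4.0]
weights = abc_weights(
    observed,
    thetas,
    simulate=lambda t: [rng.gauss(t, 1.0) for _ in range(200)],
    epsilon=0.1,
)
best = thetas[weights.index(max(weights))]
print(best)
```

No summary statistics are chosen anywhere in this sketch; the full samples are compared directly, which is the point of using a data discrepancy measure.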


A Statistically Principled and Computationally Efficient Approach to Speech Enhancement using Variational Autoencoders

arXiv.org Machine Learning

Recent studies have explored the use of deep generative models of speech spectra based on variational autoencoders (VAEs), combined with unsupervised noise models, to perform speech enhancement. These studies developed iterative algorithms involving either Gibbs sampling or gradient descent at each step, making them computationally expensive. This paper proposes a variational inference method to iteratively estimate the power spectrogram of the clean speech. Our main contribution is the analytical derivation of the variational steps in which the encoder of the pre-learned VAE can be used to estimate the variational approximation of the true posterior distribution, using the very same assumption made to train VAEs. Experiments show that the proposed method produces results on par with the aforementioned iterative methods using sampling, while decreasing the computational cost by a factor of 36 to reach a given performance.


Variational approximations using Fisher divergence

arXiv.org Machine Learning

Modern applications of Bayesian inference involve models that are sufficiently complex that the corresponding posterior distributions are intractable and must be approximated. The most common approximation is based on Markov chain Monte Carlo, but this can be expensive when the data set is large and/or the model is complex, so more efficient variational approximations have recently received considerable attention. Traditional variational methods, which seek to minimize the Kullback-Leibler divergence between the posterior and a relatively simple parametric family, provide accurate and efficient estimation of the posterior mean but often do not capture other moments, and they have limitations in terms of the models to which they can be applied. Here we propose the construction of variational approximations based on minimizing the Fisher divergence, and develop an efficient computational algorithm that can be applied to a wide range of models without conjugacy or potentially unrealistic mean-field assumptions. We demonstrate the superior performance of the proposed method for the benchmark case of logistic regression.
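For readers unfamiliar with the Fisher divergence, a small worked case may help. The sketch below uses the common definition, the expected squared difference between the score functions (gradients of the log densities), with the expectation taken under the first argument; the abstract does not state which direction the paper uses, so that choice is an assumption here, and the function name is hypothetical. For two one-dimensional Gaussians the expectation has a closed form.

```python
def fisher_divergence_gaussians(mu_q, sigma_q, mu_p, sigma_p):
    """Fisher divergence E_q[(d/dx log q(x) - d/dx log p(x))^2]
    for q = N(mu_q, sigma_q^2) and p = N(mu_p, sigma_p^2).

    With a = 1/sigma_p^2 and b = 1/sigma_q^2 the score difference is
    (a - b) * (x - mu_q) + a * (mu_q - mu_p), whose second moment under q
    is (a - b)^2 * sigma_q^2 + a^2 * (mu_q - mu_p)^2.
    """
    a = 1.0 / sigma_p ** 2
    b = 1.0 / sigma_q ** 2
    return (a - b) ** 2 * sigma_q ** 2 + a ** 2 * (mu_q - mu_p) ** 2


# Identical Gaussians give zero; a pure mean shift and a pure scale change do not.
print(fisher_divergence_gaussians(0.0, 1.0, 0.0, 1.0))
print(fisher_divergence_gaussians(0.0, 1.0, 1.0, 1.0))
print(fisher_divergence_gaussians(0.0, 1.0, 0.0, 2.0))
```

Because only score functions appear, the normalizing constant of the target density never has to be evaluated, which is one reason this divergence is attractive when the posterior is known only up to a constant.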