Goto

Collaborating Authors

 Bayesian Inference


Theory Refinement on Bayesian Networks

arXiv.org Artificial Intelligence

Theory refinement is the task of updating a domain theory in the light of new cases, to be done automatically or with some expert assistance. The problem of theory refinement under uncertainty is reviewed here in the context of Bayesian statistics, a theory of belief revision. The problem is reduced to an incremental learning task as follows: the learning system is initially primed with a partial theory supplied by a domain expert, and thereafter maintains its own internal representation of alternative theories which is able to be interrogated by the domain expert and able to be incrementally refined from data. Algorithms for refinement of Bayesian networks are presented to illustrate what is meant by "partial theory", "alternative theory representation", etc. The algorithms are an incremental variant of batch learning algorithms from the literature so can work well in batch and incremental mode.


Some Properties of Plausible Reasoning

arXiv.org Artificial Intelligence

This paper presents a plausible reasoning system to illustrate some broad issues in knowledge representation: dualities between different reasoning forms, the difficulty of unifying complementary reasoning styles, and the approximate nature of plausible reasoning. These issues have a common underlying theme: there should be an underlying belief calculus of which the many different reasoning forms are special cases, sometimes approximate. The system presented allows reasoning about defaults, likelihood, necessity and possibility in a manner similar to the earlier work of Adams. The system is based on the belief calculus of subjective Bayesian probability which itself is based on a few simple assumptions about how belief should be manipulated. Approximations, semantics, consistency and consequence results are presented for the system. While this puts these often discussed plausible reasoning forms on a probabilistic footing, useful application to practical problems remains an issue.


Constraint Propagation with Imprecise Conditional Probabilities

arXiv.org Artificial Intelligence

An approach to reasoning with default rules where the proportion of exceptions, or more generally the probability of encountering an exception, can be at least roughly assessed is presented. It is based on local uncertainty propagation rules which provide the best bracketing of a conditional probability of interest from the knowledge of the bracketing of some other conditional probabilities. A procedure that uses two such propagation rules repeatedly is proposed in order to estimate any simple conditional probability of interest from the available knowledge. The iterative procedure, that does not require independence assumptions, looks promising with respect to the linear programming method. Improved bounds for conditional probabilities are given when independence assumptions hold.


"Conditional Inter-Causally Independent" Node Distributions, a Property of "Noisy-Or" Models

arXiv.org Artificial Intelligence

This paper examines the interdependence generated between two parent nodes with a common instantiated child node, such as two hypotheses sharing common evidence. The relation so generated has been termed "intercausal." It is shown by construction that inter-causal independence is possible for binary distributions at one state of evidence. For such "CICI" distributions, the two measures of inter-causal effect, "multiplicative synergy" and "additive synergy" are equal. The well known "noisy-or" model is an example of such a distribution. This introduces novel semantics for the noisy-or, as a model of the degree of conflict among competing hypotheses of a common observation. In a general Bayesian network, the relation between a pair of nodes can be predictive, meaning we are interested in the effect of a node upon its successors, or, oppositely, diagnostic, where we infer the state of a node from knowledge of its successors.


Modeling a Sensor to Improve its Efficacy

arXiv.org Machine Learning

Robots rely on sensors to provide them with information about their surroundings. However, high-quality sensors can be extremely expensive and cost-prohibitive. Thus many robotic systems must make due with lower-quality sensors. Here we demonstrate via a case study how modeling a sensor can improve its efficacy when employed within a Bayesian inferential framework. As a test bed we employ a robotic arm that is designed to autonomously take its own measurements using an inexpensive LEGO light sensor to estimate the position and radius of a white circle on a black field. The light sensor integrates the light arriving from a spatially distributed region within its field of view weighted by its Spatial Sensitivity Function (SSF). We demonstrate that by incorporating an accurate model of the light sensor SSF into the likelihood function of a Bayesian inference engine, an autonomous system can make improved inferences about its surroundings. The method presented here is data-based, fairly general, and made with plug-and play in mind so that it could be implemented in similar problems.


Generalized Thompson Sampling for Sequential Decision-Making and Causal Inference

arXiv.org Artificial Intelligence

Recently, it has been shown how sampling actions from the predictive distribution over the optimal action-sometimes called Thompson sampling-can be applied to solve sequential adaptive control problems, when the optimal policy is known for each possible environment. The predictive distribution can then be constructed by a Bayesian superposition of the optimal policies weighted by their posterior probability that is updated by Bayesian inference and causal calculus. Here we discuss three important features of this approach. First, we discuss in how far such Thompson sampling can be regarded as a natural consequence of the Bayesian modeling of policy uncertainty. Second, we show how Thompson sampling can be used to study interactions between multiple adaptive agents, thus, opening up an avenue of game-theoretic analysis. Third, we show how Thompson sampling can be applied to infer causal relationships when interacting with an environment in a sequential fashion. In summary, our results suggest that Thompson sampling might not merely be a useful heuristic, but a principled method to address problems of adaptive sequential decision-making and causal inference.


Sparse estimation via nonconcave penalized likelihood in a factor analysis model

arXiv.org Machine Learning

We consider the problem of sparse estimation in a factor analysis model. A traditional estimation procedure in use is the following two-step approach: the model is estimated by maximum likelihood method and then a rotation technique is utilized to find sparse factor loadings. However, the maximum likelihood estimates cannot be obtained when the number of variables is much larger than the number of observations. Furthermore, even if the maximum likelihood estimates are available, the rotation technique does not often produce a sufficiently sparse solution. In order to handle these problems, this paper introduces a penalized likelihood procedure that imposes a nonconvex penalty on the factor loadings. We show that the penalized likelihood procedure can be viewed as a generalization of the traditional two-step approach, and the proposed methodology can produce sparser solutions than the rotation technique. A new algorithm via the EM algorithm along with coordinate descent is introduced to compute the entire solution path, which permits the application to a wide variety of convex and nonconvex penalties. Monte Carlo simulations are conducted to investigate the performance of our modeling strategy. A real data example is also given to illustrate our procedure.


Variational Semi-blind Sparse Deconvolution with Orthogonal Kernel Bases and its Application to MRFM

arXiv.org Machine Learning

We present a variational Bayesian method of joint image reconstruction and point spread function (PSF) estimation when the PSF of the imaging device is only partially known. To solve this semi-blind deconvolution problem, prior distributions are specified for the PSF and the 3D image. Joint image reconstruction and PSF estimation is then performed within a Bayesian framework, using a variational algorithm to estimate the posterior distribution. The image prior distribution imposes an explicit atomic measure that corresponds to image sparsity. Importantly, the proposed Bayesian deconvolution algorithm does not require hand tuning. Simulation results clearly demonstrate that the semi-blind deconvolution algorithm compares favorably with previous Markov chain Monte Carlo (MCMC) version of myopic sparse reconstruction. It significantly outperforms mismatched non-blind algorithms that rely on the assumption of the perfect knowledge of the PSF. The algorithm is illustrated on real data from magnetic resonance force microscopy (MRFM).


Herded Gibbs Sampling

arXiv.org Machine Learning

The Gibbs sampler is one of the most popular algorithms for inference in statistical models. In this paper, we introduce a herding variant of this algorithm, called herded Gibbs, that is entirely deterministic. We prove that herded Gibbs has an $O(1/T)$ convergence rate for models with independent variables and for fully connected probabilistic graphical models. Herded Gibbs is shown to outperform Gibbs in the tasks of image denoising with MRFs and named entity recognition with CRFs. However, the convergence for herded Gibbs for sparsely connected probabilistic graphical models is still an open problem.


EigenGP: Sparse Gaussian process models with data-dependent eigenfunctions

arXiv.org Machine Learning

Gaussian processes (GPs) provide a nonparametric representation of functions. However, classical GP inference suffers from high computational cost and it is difficult to design nonstationary GP priors in practice. In this paper, we propose a sparse Gaussian process model, EigenGP, based on the Karhunen-Loève (KL) expansion of a GP prior. We use the Nyström approximation to obtain data dependent eigenfunctions and select these eigenfunctions by evidence maximization. This selection reduces the number of eigenfunctions in our model and provides a nonstationary covariance function. To handle nonlinear likelihoods, we develop an efficient expectation propagation (EP) inference algorithm, and couple it with expectation maximization for eigenfunction selection. Because the eigenfunctions of a Gaussian kernel are associated with clusters of samples - including both the labeled and unlabeled - selecting relevant eigenfunctions enables EigenGP to conduct semi-supervised learning. Our experimental results demonstrate improved predictive performance of EigenGP over alternative state-of-the-art sparse GP and semisupervised learning methods for regression, classification, and semisupervised classification.