 Bayesian Learning


Dynamic Structure Super-Resolution

Neural Information Processing Systems

The problem of super-resolution involves generating feasible higher resolution images, which are pleasing to the eye and realistic, from a given low resolution image. This might be attempted by using simple filters for smoothing out the high resolution blocks or through applications where substantial prior information is used to imply the textures and shapes which will occur in the images. In this paper we describe an approach which lies between the two extremes. It is a generic unsupervised method which is usable in all domains, but goes beyond simple smoothing methods in what it achieves. We use a dynamic tree-like architecture to model the high resolution data. Approximate conditioning on the low resolution image is achieved through a mean field approach.


Bayesian Image Super-Resolution

Neural Information Processing Systems

The extraction of a single high-quality image from a set of low-resolution images is an important problem which arises in fields such as remote sensing, surveillance, medical imaging and the extraction of still images from video. Typical approaches are based on the use of cross-correlation to register the images followed by the inversion of the transformation from the unknown high resolution image to the observed low resolution images, using regularization to resolve the ill-posed nature of the inversion process. In this paper we develop a Bayesian treatment of the super-resolution problem in which the likelihood function for the image registration parameters is based on a marginalization over the unknown high-resolution image. This approach allows us to estimate the unknown point spread function, and is rendered tractable through the introduction of a Gaussian process prior over images. Results indicate a significant improvement over techniques based on MAP (maximum a-posteriori) point optimization of the high resolution image and associated registration parameters.

1 Introduction

The task in super-resolution is to combine a set of low resolution images of the same scene in order to obtain a single image of higher resolution. Provided the individual low resolution images have sub-pixel displacements relative to each other, it is possible to extract high frequency details of the scene well beyond the Nyquist limit of the individual source images.


Learning Graphical Models with Mercer Kernels

Neural Information Processing Systems

We present a class of algorithms for learning the structure of graphical models from data. The algorithms are based on a measure known as the kernel generalized variance (KGV), which essentially allows us to treat all variables on an equal footing as Gaussians in a feature space obtained from Mercer kernels. Thus we are able to learn hybrid graphs involving discrete and continuous variables of arbitrary type. We explore the computational properties of our approach, showing how to use the kernel trick to compute the relevant statistics in linear time. We illustrate our framework with experiments involving discrete and continuous data.


Discriminative Learning for Label Sequences via Boosting

Neural Information Processing Systems

Well-known applications include part-of-speech (POS) tagging, named entity classification, information extraction, text segmentation and phoneme classification in text and speech processing [7] as well as problems like protein homology detection, secondary structure prediction or gene classification in computational biology [3]. Up to now, the predominant formalism for modeling and predicting label sequences has been based on Hidden Markov Models (HMMs) and variations thereof. Yet, despite their success, generative probabilistic models - of which HMMs are a special case - have two major shortcomings, which this paper is not the first one to point out. First, generative probabilistic models are typically trained using maximum likelihood estimation (MLE) for a joint sampling model of observation and label sequences. As has been emphasized frequently, MLE based on the joint probability model is inherently non-discriminative and thus may lead to suboptimal prediction accuracy. Secondly, efficient inference and learning in this setting often requires making questionable conditional independence assumptions.


A Differential Semantics for Jointree Algorithms

Neural Information Processing Systems

A new approach to inference in belief networks has been recently proposed, which is based on an algebraic representation of belief networks using multi-linear functions. According to this approach, the key computational question is that of representing multi-linear functions compactly, since inference reduces to a simple process of evaluating and differentiating such functions. We show here that mainstream inference algorithms based on jointrees are a special case of this approach in a very precise sense. We use this result to prove new properties of jointree algorithms, and then discuss some of its practical and theoretical implications.
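The idea that inference reduces to evaluating and differentiating a multi-linear function can be illustrated on a toy case. The following is a minimal sketch, not the paper's algorithm: the two-node network A -> B, its probability values, and all function names are hypothetical and chosen only for illustration. It evaluates the multi-linear function under evidence to obtain the probability of that evidence, and differentiates it with respect to an evidence indicator to obtain a joint marginal.

```python
from itertools import product

# Hypothetical two-node network A -> B with binary variables (illustrative numbers).
theta_a = {0: 0.6, 1: 0.4}                      # P(A = a)
theta_b_given_a = {(0, 0): 0.9, (1, 0): 0.1,    # P(B = b | A = a), keyed by (b, a)
                   (0, 1): 0.2, (1, 1): 0.8}


def network_polynomial(lam_a, lam_b):
    """Multi-linear function f = sum over (a, b) of lam_a[a] * lam_b[b] * P(a) * P(b | a)."""
    return sum(lam_a[a] * lam_b[b] * theta_a[a] * theta_b_given_a[(b, a)]
               for a, b in product((0, 1), repeat=2))


def partial_wrt_indicator(lam_a, lam_b, var, value):
    """Exact partial derivative of f with respect to one evidence indicator.

    f is linear in each indicator, so the derivative equals
    f(indicator = 1) - f(indicator = 0) with all other indicators held fixed.
    """
    hi_a, lo_a, hi_b, lo_b = dict(lam_a), dict(lam_a), dict(lam_b), dict(lam_b)
    if var == "A":
        hi_a[value], lo_a[value] = 1.0, 0.0
    else:
        hi_b[value], lo_b[value] = 1.0, 0.0
    return network_polynomial(hi_a, hi_b) - network_polynomial(lo_a, lo_b)


# Evidence B = 1: indicators of A stay at 1, indicators of B select b = 1.
lam_a = {0: 1.0, 1: 1.0}
lam_b = {0: 0.0, 1: 1.0}

prob_evidence = network_polynomial(lam_a, lam_b)          # P(B = 1)
joint_a1 = partial_wrt_indicator(lam_a, lam_b, "A", 1)    # P(A = 1, B = 1)
posterior_a1 = joint_a1 / prob_evidence                   # P(A = 1 | B = 1)
```

The sketch enumerates every term of the function, which is exponential in the number of variables; the abstract's point is that the practical question is how to represent the function compactly.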


VIBES: A Variational Inference Engine for Bayesian Networks

Neural Information Processing Systems

In recent years variational methods have become a popular tool for approximate inference and learning in a wide variety of probabilistic models. For each new application, however, it is currently necessary first to derive the variational update equations, and then to implement them in application-specific code. Each of these steps is both time consuming and error prone. In this paper we describe a general purpose inference engine called VIBES ('Variational Inference for Bayesian Networks') which allows a wide variety of probabilistic models to be implemented and solved variationally without recourse to coding. New models are specified either through a simple script or via a graphical interface analogous to a drawing package. VIBES then automatically generates and solves the variational equations. We illustrate the power and flexibility of VIBES using examples from Bayesian mixture modelling.


Regularized Greedy Importance Sampling

Neural Information Processing Systems

Greedy importance sampling is an unbiased estimation technique that reduces the variance of standard importance sampling by explicitly searching for modes in the estimation objective. Previous work has demonstrated the feasibility of implementing this method and proved that the technique is unbiased in both discrete and continuous domains. In this paper we present a reformulation of greedy importance sampling that eliminates the free parameters from the original estimator, and introduces a new regularization strategy that further reduces variance without compromising unbiasedness. The resulting estimator is shown to be effective for difficult estimation problems arising in Markov random field inference. In particular, improvements are achieved over standard MCMC estimators when the distribution has multiple peaked modes.
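For reference, the baseline that the greedy method improves on can be sketched as follows. This is plain importance sampling only, not the greedy or regularized estimator of the paper; the Gaussian target, the wider Gaussian proposal, and the function names are assumptions made for illustration. Each draw from the proposal is reweighted by the ratio of target density to proposal density, which keeps the estimate of the expectation unbiased.

```python
import math
import random


def normal_pdf(x, sigma):
    """Density of a zero-mean Gaussian with standard deviation sigma."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def importance_sampling_estimate(f, n_samples, proposal_sigma=2.0, seed=0):
    """Estimate E_p[f(x)] for p = N(0, 1) using draws from q = N(0, proposal_sigma^2)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(0.0, proposal_sigma)                        # draw from the proposal q
        weight = normal_pdf(x, 1.0) / normal_pdf(x, proposal_sigma)  # p(x) / q(x)
        total += weight * f(x)
    return total / n_samples


# The second moment of a standard normal is exactly 1, so the estimate should be close to 1.
estimate = importance_sampling_estimate(lambda x: x * x, 20000)
```

The variance of this estimator depends on how well the proposal covers the regions where the weighted integrand is large, which is the weakness that a mode-seeking search is meant to address.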


Adaptive Classification by Variational Kalman Filtering

Neural Information Processing Systems

We propose in this paper a probabilistic approach for adaptive inference of generalized nonlinear classification that combines the computational advantage of a parametric solution with the flexibility of sequential sampling techniques. We regard the parameters of the classifier as latent states in a first order Markov process and propose an algorithm which can be regarded as a variational generalization of standard Kalman filtering. The variational Kalman filter is based on two novel lower bounds that enable us to use a non-degenerate distribution over the adaptation rate. An extensive empirical evaluation demonstrates that the proposed method is capable of inferring competitive classifiers both in stationary and non-stationary environments. Although we focus on classification, the algorithm is easily extended to other generalized nonlinear models.
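The standard Kalman filter that the paper generalizes can be sketched for the simplest case of a single parameter following a first order Markov (random walk) process. This is the textbook linear-Gaussian filter with a fixed adaptation rate, not the variational version of the paper, and it uses a linear regression observation rather than a classifier; the function name and the noise variances `drift_var` and `noise_var` are hypothetical.

```python
def kalman_track_weight(inputs, targets, drift_var=0.01, noise_var=0.1,
                        init_mean=0.0, init_var=1.0):
    """Track a scalar weight w_t with w_t = w_{t-1} + drift and y_t = x_t * w_t + noise."""
    mean, var = init_mean, init_var
    history = []
    for x, y in zip(inputs, targets):
        # Predict: the random walk leaves the mean unchanged and inflates the variance.
        var = var + drift_var
        # Update: the Kalman gain weighs the prediction against the new observation.
        gain = var * x / (x * x * var + noise_var)
        mean = mean + gain * (y - x * mean)
        var = (1.0 - gain * x) * var
        history.append(mean)
    return mean, var, history


# Noise-free observations generated with a true weight of 2.0.
xs = [1.0] * 50
ys = [2.0 * x for x in xs]
final_mean, final_var, trace = kalman_track_weight(xs, ys)
```

Here `drift_var` plays the role of a fixed adaptation rate: a larger value lets the estimate follow non-stationary data faster at the cost of noisier estimates.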


Dynamic Bayesian Networks with Deterministic Latent Tables

Neural Information Processing Systems

The application of latent/hidden variable Dynamic Bayesian Networks is constrained by the complexity of marginalising over latent variables. For this reason either small latent dimensions or Gaussian latent conditional tables linearly dependent on past states are typically considered in order that inference is tractable. We suggest an alternative approach in which the latent variables are modelled using deterministic conditional probability tables.


Independent Components Analysis through Product Density Estimation

Neural Information Processing Systems

We present a simple direct approach for solving the ICA problem, using density estimation and maximum likelihood. Given a candidate orthogonal frame, we model each of the coordinates using a semi-parametric density estimate based on cubic splines. Since our estimates have two continuous derivatives, we can easily run a second order search for the frame parameters. Our method performs very favorably when compared to state-of-the-art techniques.

1 Introduction

Independent component analysis (ICA) is a popular enhancement over principal component analysis (PCA) and factor analysis. We observe a random vector $X \in \mathbb{R}^p$ which is assumed to arise from a linear mixing of a latent random source vector $S \in \mathbb{R}^p$,

$$X = AS; \tag{1}$$

the components $S_j$, $j = 1, \ldots, p$, of $S$ are assumed to be independently distributed.
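The mixing model and the search over candidate orthogonal frames can be sketched in two dimensions. This is an illustration under stated assumptions, not the paper's method: the spline-based density estimate and second order search are replaced by a simple kurtosis contrast and a grid search, the sources are taken to be uniform, the mixing matrix A is taken to be a rotation, and all names are hypothetical.

```python
import math
import random


def excess_kurtosis(values):
    """Sample excess kurtosis: fourth central moment over squared variance, minus 3."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    var = sum(c * c for c in centered) / n
    fourth = sum(c ** 4 for c in centered) / n
    return fourth / (var * var) - 3.0


def rotate(points, angle):
    """Apply the rotation [[c, -s], [s, c]] to each 2-D point."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in points]


def contrast(points, angle):
    """Score a candidate frame: undo the rotation, then sum squared excess kurtosis."""
    unmixed = rotate(points, -angle)
    return sum(excess_kurtosis([p[k] for p in unmixed]) ** 2 for k in range(2))


rng = random.Random(0)
true_angle = 0.5
# Independent, non-Gaussian sources S with uniform components.
sources = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(4000)]
# Observations X = A S, with A the rotation by true_angle.
observed = rotate(sources, true_angle)

# Grid search over candidate frames in [0, pi/2), one degree apart.
candidates = [k * (math.pi / 2) / 90 for k in range(90)]
best_angle = max(candidates, key=lambda a: contrast(observed, a))
```

The search is restricted to angles below a quarter turn because the frame is only identifiable up to permutation and sign of the components.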