Learning Graphical Models
Rao-Blackwellised Particle Filtering via Data Augmentation
Andrieu, Christophe, Freitas, Nando D., Doucet, Arnaud
SMC is often referred to as particle filtering (PF) in the context of computing filtering distributions for statistical inference and learning. It is known that the performance of PF often deteriorates in high-dimensional state spaces. In the past, we have shown that if a model admits partial analytical tractability, it is possible to combine PF with exact algorithms (Kalman filters, HMM filters, junction tree algorithm) to obtain efficient high-dimensional filters (Doucet, de Freitas, Murphy and Russell 2000; Doucet, Godsill and Andrieu 2000). In particular, we exploited a marginalisation technique known as Rao-Blackwellisation (RB). Here, we attack a more complex model that does not admit immediate analytical tractability. This probabilistic model consists of Gaussian latent variables and binary observations. We show that by augmenting the model with artificial variables, it becomes possible to apply Rao-Blackwellisation and optimal sampling strategies. We focus on the problem of sequential binary classification (that is, when the data arrive one at a time) using generic classifiers that consist of linear combinations of basis functions, whose coefficients evolve according to a Gaussian smoothness prior (Kitagawa and Gersch 1996). We have previously addressed this problem in the context of sequential fault detection in marine diesel engines (Højen-Sørensen, de Freitas and Fog 2000). This application is of great importance, as early detection of incipient faults can improve safety and efficiency and help reduce downtime and plant maintenance in many industrial and transportation environments.
Means, Correlations and Bounds
Leisink, Martijn, Kappen, Bert
The partition function for a Boltzmann machine can be bounded from above and below. We can use this to bound the means and the correlations. For networks with small weights, the values of these statistics can be restricted to nontrivial regions (i.e. a subset of [-1, 1]). Experimental results show that reasonable bounding occurs for weight sizes where mean field expansions generally give good results. Over the last decade, bounding techniques have become a popular tool to deal with graphical models that are too complex for exact computation. A nice property of bounds is that they give at least some information you can rely on.
Boosting and Maximum Likelihood for Exponential Models
Lebanon, Guy, Lafferty, John D.
We derive an equivalence between AdaBoost and the dual of a convex optimization problem, showing that the only difference between minimizing the exponential loss used by AdaBoost and maximum likelihood for exponential models is that the latter requires the model to be normalized to form a conditional probability distribution over labels. In addition to establishing a simple and easily understood connection between the two methods, this framework enables us to derive new regularization procedures for boosting that directly correspond to penalized maximum likelihood. Experiments on UCI datasets support our theoretical analysis and give additional insight into the relationship between boosting and logistic regression.
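The role of normalization can be illustrated with a minimal sketch. It assumes one common way of writing the two losses for a single example with per-label scores s_y and observed label t: the exponential loss sums exp(s_y - s_t) over the wrong labels, while the log-loss is the negative log of the normalized conditional probability of t. The function names and the dictionary-of-scores interface are illustrative assumptions, not the paper's notation.

```python
import math


def exponential_loss(scores, true_label):
    """Exponential loss of an unnormalized model on one example.

    scores maps each label to a real-valued score (hypothetical interface).
    """
    s_true = scores[true_label]
    return sum(math.exp(s - s_true) for y, s in scores.items() if y != true_label)


def log_loss(scores, true_label):
    """Negative log-likelihood after normalizing the scores over all labels."""
    s_true = scores[true_label]
    # Normalizer divided by exp(s_true); the true label contributes exp(0) = 1.
    z = sum(math.exp(s - s_true) for s in scores.values())
    return math.log(z)


# Binary case: scores +F/2 and -F/2 give the familiar exp(-y * F) for y = +1.
F = 1.2
binary_scores = {+1: 0.5 * F, -1: -0.5 * F}
e = exponential_loss(binary_scores, +1)
l = log_loss(binary_scores, +1)
```

Under these assumptions the two losses are linked by log_loss = log(1 + exponential_loss), so the log-loss never exceeds the exponential loss on any example.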
Distribution of Mutual Information
The mutual information of two random variables i and j with joint probabilities {π_ij} is commonly used in learning Bayesian nets as well as in many other fields. The chances π_ij are usually estimated by the empirical sampling frequency n_ij/n, leading to a point estimate I(n_ij/n) for the mutual information. To answer questions like "is I(n_ij/n) consistent with zero?" or "what is the probability that the true mutual information is much larger than the point estimate?"
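The point estimate mentioned above, the mutual information evaluated at the empirical frequencies n_ij/n, can be sketched as follows. This is only the plug-in estimate, not the distribution the abstract goes on to ask about; the function name, the list-of-lists count table, and the use of natural logarithms (nats) are assumptions made for illustration.

```python
import math


def plug_in_mutual_information(counts):
    """Plug-in estimate of mutual information (in nats) from counts n_ij."""
    n = sum(sum(row) for row in counts)
    if n == 0:
        raise ValueError("count table is empty")
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    mi = 0.0
    for i, row in enumerate(counts):
        for j, n_ij in enumerate(row):
            if n_ij > 0:  # cells with zero count contribute nothing
                # (n_ij/n) * log( (n_ij/n) / ((n_i+/n) * (n_+j/n)) )
                mi += (n_ij / n) * math.log(n_ij * n / (row_totals[i] * col_totals[j]))
    return mi


independent = plug_in_mutual_information([[10, 10], [10, 10]])
dependent = plug_in_mutual_information([[20, 0], [0, 20]])
```

With a finite sample the estimate is rarely exactly zero even for independent variables, which is why a single number cannot by itself answer whether it is "consistent with zero".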
Probabilistic Inference of Hand Motion from Neural Activity in Motor Cortex
Gao, Yun, Black, Michael J., Bienenstock, Elie, Shoham, Shy, Donoghue, John P.
Statistical learning and probabilistic inference techniques are used to infer the hand position of a subject from multi-electrode recordings of neural activity in motor cortex. First, an array of electrodes provides training data of neural firing conditioned on hand kinematics. We learn a nonparametric representation of this firing activity using a Bayesian model and rigorously compare it with previous models using cross-validation. Second, we infer a posterior probability distribution over hand motion conditioned on a sequence of neural test data using Bayesian inference. The learned firing models of multiple cells are used to define a non-Gaussian likelihood term which is combined with a prior probability for the kinematics. A particle filtering method is used to represent, update, and propagate the posterior distribution over time. The approach is compared with traditional linear filtering methods; the results suggest that it may be appropriate for neural prosthetic applications.
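The represent, update, and propagate cycle of a particle filter can be sketched generically. The following is a textbook bootstrap filter step, not the authors' implementation: the function name, the callable interfaces for the prior dynamics (`transition_sample`) and the likelihood, and the multinomial resampling scheme are all assumptions made for illustration.

```python
import random


def particle_filter_step(particles, observation, transition_sample, likelihood, rng):
    """One bootstrap particle filter step: propagate, weight, resample.

    transition_sample(x, rng) draws a next state from the prior dynamics;
    likelihood(observation, x) returns a nonnegative, possibly non-Gaussian
    likelihood value. Both are hypothetical, user-supplied callables.
    """
    # Propagate each particle through the prior dynamics.
    proposed = [transition_sample(x, rng) for x in particles]
    # Weight each propagated particle by the likelihood of the new observation.
    weights = [likelihood(observation, x) for x in proposed]
    if sum(weights) <= 0.0:
        raise ValueError("all particles have zero likelihood")
    # Multinomial resampling proportional to the weights.
    return rng.choices(proposed, weights=weights, k=len(proposed))


rng = random.Random(0)
particles = [0.0, 10.0] * 50
posterior = particle_filter_step(
    particles,
    observation=10.0,
    transition_sample=lambda x, rng: x,  # static state, for illustration only
    likelihood=lambda obs, x: 1.0 if abs(x - obs) < 1.0 else 0.0,
    rng=rng,
)
```

In this toy run the particles at 0.0 receive zero weight, so resampling keeps only particles at 10.0; a realistic model would use noisy dynamics and a likelihood built from learned firing models.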
ACh, Uncertainty, and Cortical Inference
Acetylcholine (ACh) has been implicated in a wide variety of tasks involving attentional processes and plasticity. Following extensive animal studies, it has previously been suggested that ACh reports on uncertainty and controls hippocampal, cortical and cortico-amygdalar plasticity. We extend this view and consider its effects on cortical representational inference, arguing that ACh controls the balance between bottom-up inference, influenced by input stimuli, and top-down inference, influenced by contextual information. We illustrate our proposal using a hierarchical hidden Markov model.
A Maximum-Likelihood Approach to Modeling Multisensory Enhancement
Multisensory response enhancement (MRE) is the augmentation of the response of a neuron to sensory input of one modality by simultaneous input from another modality. The maximum likelihood (ML) model presented here modifies the Bayesian model for MRE (Anastasio et al.) by incorporating a decision strategy to maximize the number of correct decisions. Thus the ML model can also deal with the important tasks of stimulus discrimination and identification in the presence of incongruent visual and auditory cues. It accounts for the inverse effectiveness observed in neurophysiological recording data, and it predicts a functional relation between uni- and bimodal levels of discriminability that is testable both in neurophysiological and behavioral experiments.
A Quantitative Model of Counterfactual Reasoning
Yarlett, Daniel, Ramscar, Michael
In this paper we explore two quantitative approaches to the modelling of counterfactual reasoning (a linear model and a noisy-OR model) based on information contained in conceptual dependency networks. Empirical data is acquired in a study, and the fit of the models to it is compared. We conclude by considering the appropriateness of nonparametric approaches to counterfactual reasoning, and examining the prospects for other parametric approaches in the future.
Causal Categorization with Bayes Nets
A theory of categorization is presented in which knowledge of causal relationships between category features is represented as a Bayesian network. Referred to as causal-model theory, this theory predicts that objects are classified as category members to the extent they are likely to have been produced by a category's causal model. On this view, people have models of the world that lead them to expect a certain distribution of features in category members (e.g., correlations between feature pairs that are directly connected by causal relationships), and consider exemplars good category members when they manifest those expectations. These expectations include sensitivity to higher-order feature interactions that emerge from the asymmetries inherent in causal relationships. Research on the topic of categorization has traditionally focused on the problem of learning new categories given observations of category members.