Bayesian Inference
The Rescorla-Wagner Algorithm and Maximum Likelihood Estimation of Causal Parameters
This paper analyzes generalizations of the classic Rescorla-Wagner (R-W) learning algorithm and studies their relationship to Maximum Likelihood estimation of causal parameters. We prove that the parameters of two popular causal models, ΔP and PC, can be learnt by the same generalized linear Rescorla-Wagner (GLRW) algorithm provided genericity conditions apply. We characterize the fixed points of these GLRW algorithms and calculate the fluctuations about them, assuming that the input is a set of i.i.d. samples. We describe how to determine convergence conditions and calculate convergence rates for the GLRW algorithms under these conditions.
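For orientation, the classic R-W rule that the paper generalizes can be sketched as follows. This is a minimal illustration only, not the GLRW algorithm from the paper: the learning rate, the trial probabilities, the always-present background cue, and the names `rescorla_wagner` and `simulate` are all assumptions made for the example. With one candidate cause and a constant background cue, the weight on the cue is expected to fluctuate around the contingency P(effect | cue) - P(effect | no cue).

```python
import numpy as np

def rescorla_wagner(cues, outcomes, lr=0.005):
    """Classic R-W update: each weight moves in proportion to the prediction error."""
    cues = np.asarray(cues, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    v = np.zeros(cues.shape[1])
    for x, r in zip(cues, outcomes):
        error = r - v @ x        # outcome minus summed prediction of the present cues
        v += lr * error * x      # only cues present on this trial are updated
    return v

def simulate(n_trials=50000, p_cue=0.5, p_with=0.8, p_without=0.2, seed=0):
    """Hypothetical i.i.d. trials with a background cue and one candidate cause."""
    rng = np.random.default_rng(seed)
    cue = rng.random(n_trials) < p_cue
    effect = rng.random(n_trials) < np.where(cue, p_with, p_without)
    cues = np.column_stack([np.ones(n_trials), cue.astype(float)])
    return rescorla_wagner(cues, effect.astype(float))

weights = simulate()
# weights[0] is the background weight, weights[1] the weight on the candidate cause.
```

Because the learning rate is held constant, the weights do not settle exactly; they keep fluctuating about the fixed point, which is the kind of behavior the abstract refers to.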
Constraining a Bayesian Model of Human Visual Speed Perception
It has been demonstrated that basic aspects of human visual motion perception are qualitatively consistent with a Bayesian estimation framework, where the prior probability distribution on velocity favors slow speeds. Here, we present a refined probabilistic model that can account for the typical trial-to-trial variabilities observed in psychophysical speed perception experiments. We also show that data from such experiments can be used to constrain both the likelihood and prior functions of the model. Specifically, we measured matching speeds and thresholds in a two-alternative forced choice speed discrimination task. Parametric fits to the data reveal that the likelihood function is well approximated by a log-normal distribution with a characteristic contrast-dependent variance, and that the prior distribution on velocity exhibits significantly heavier tails than a Gaussian, and approximately follows a power-law function.
Unsupervised Variational Bayesian Learning of Nonlinear Models
In this paper we present a framework for using multi-layer perceptron (MLP) networks in nonlinear generative models trained by variational Bayesian learning. The nonlinearity is handled by linearizing it using a Gauss-Hermite quadrature at the hidden neurons. The method can be used to derive nonlinear counterparts for linear algorithms such as factor analysis, independent component/factor analysis and state-space models. This is demonstrated with a nonlinear factor analysis experiment in which even 20 sources can be estimated from a real world speech data set.
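As a rough illustration of the quadrature step, the sketch below propagates a Gaussian input through a nonlinearity and estimates the output mean and variance with a Gauss-Hermite rule. It is not the paper's linearization scheme: the function name `gauss_hermite_moments`, the quadrature order, and the use of `tanh` as the hidden-unit nonlinearity are assumptions chosen for the example.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

def gauss_hermite_moments(f, mean, var, order=10):
    """Approximate E[f(X)] and Var[f(X)] for X ~ N(mean, var)."""
    nodes, weights = hermgauss(order)          # rule for the weight function exp(-x^2)
    x = mean + np.sqrt(2.0 * var) * nodes      # change of variables to N(mean, var)
    w = weights / np.sqrt(np.pi)               # normalized weights sum to 1
    fx = f(x)
    out_mean = np.sum(w * fx)
    out_var = np.sum(w * (fx - out_mean) ** 2)
    return out_mean, out_var

# Example: moments of a tanh hidden unit whose input is N(0.5, 0.3).
hidden_mean, hidden_var = gauss_hermite_moments(np.tanh, 0.5, 0.3)
```

A rule with `order` nodes integrates polynomials up to degree `2 * order - 1` exactly against the Gaussian, so a small order is often enough for smooth nonlinearities.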
Maximum Likelihood Estimation of Intrinsic Dimension
We propose a new method for estimating intrinsic dimension of a dataset derived by applying the principle of maximum likelihood to the distances between close neighbors. We derive the estimator by a Poisson process approximation, assess its bias and variance theoretically and by simulations, and apply it to a number of simulated and real datasets. We also show it has the best overall performance compared with two other intrinsic dimension estimators.
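A nearest-neighbor maximum likelihood estimate of this kind can be sketched as below. The sketch assumes the commonly cited form of the local estimate, the inverse of the mean log-ratio between the k-th neighbor distance and each closer neighbor distance, averaged over all points. The value of k, the brute-force distance computation, the synthetic test data, and the name `mle_intrinsic_dimension` are assumptions for illustration, not details taken from the abstract.

```python
import numpy as np

def mle_intrinsic_dimension(points, k=10):
    """Average of local estimates (k - 1) / sum_j log(T_k / T_j), j = 1..k-1."""
    points = np.asarray(points, dtype=float)
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))       # brute force, O(n^2) memory
    # Sort each row; column 0 is the zero distance from a point to itself.
    knn = np.sort(dists, axis=1)[:, 1:k + 1]         # T_1 .. T_k for every point
    log_ratios = np.log(knn[:, -1:] / knn[:, :-1])   # log(T_k / T_j)
    local = (k - 1) / log_ratios.sum(axis=1)         # assumes no duplicate points
    return local.mean()

# Hypothetical data: a 2-D uniform sheet embedded in 5-D space.
rng = np.random.default_rng(0)
sheet = rng.random((500, 2))
embedded = np.hstack([sheet, np.zeros((500, 3))])
estimate = mle_intrinsic_dimension(embedded, k=10)
```

For small k this form is biased somewhat upward, so an estimate a little above 2 is expected for the sheet; larger samples and a range of k values would be needed for any serious use.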
A Machine Learning Approach to Conjoint Analysis
Choice-based conjoint analysis builds models of consumer preferences over products with answers gathered in questionnaires. Our main goal is to bring tools from the machine learning community to solve this problem more efficiently. Thus, we propose two algorithms to quickly and accurately estimate consumer preferences. Conjoint analysis (also called trade-off analysis) is one of the most popular marketing research techniques used to determine which features a new product should have, by conjointly measuring consumers' trade-offs between discretized attributes. In this paper, we will focus on the choice-based conjoint analysis (CBC) framework [11] since it is both widely used and realistic: at each question in the survey, the consumer is asked to choose one product from several.
Learning Gaussian Process Kernels via Hierarchical Bayes
We present a novel method for learning with Gaussian process regression in a hierarchical Bayesian framework. In a first step, kernel matrices on a fixed set of input points are learned from data using a simple and efficient EM algorithm. This step is nonparametric, in that it does not require a parametric form of covariance function. In a second step, kernel functions are fitted to approximate the learned covariance matrix using a generalized Nyström method, which results in a complex, data-driven kernel. We evaluate our approach as a recommendation engine for art images, where the proposed hierarchical Bayesian method leads to excellent prediction performance.
Hierarchical Bayesian Inference in Networks of Spiking Neurons
There is growing evidence from psychophysical and neurophysiological studies that the brain utilizes Bayesian principles for inference and decision making. An important open question is how Bayesian inference for arbitrary graphical models can be implemented in networks of spiking neurons. In this paper, we show that recurrent networks of noisy integrate-and-fire neurons can perform approximate Bayesian inference for dynamic and hierarchical graphical models. The membrane potential dynamics of neurons is used to implement belief propagation in the log domain. The spiking probability of a neuron is shown to approximate the posterior probability of the preferred state encoded by the neuron, given past inputs.
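To make "belief propagation in the log domain" concrete, the following sketch runs the standard forward (filtering) recursion for a hidden Markov model entirely in log probabilities. It says nothing about the spiking-network implementation described in the abstract; the two-state model, its numbers, and the names `logsumexp` and `log_forward_filter` are assumptions for illustration.

```python
import numpy as np

def logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))

def log_forward_filter(log_prior, log_trans, log_lik):
    """Return log P(state_t | obs_1..t) for every t.

    log_trans[j, i] = log P(state_t = i | state_{t-1} = j)
    log_lik[t, i]   = log P(obs_t | state_t = i)
    """
    log_post = log_prior + log_lik[0]
    log_post = log_post - logsumexp(log_post, axis=0)
    history = [log_post]
    for t in range(1, len(log_lik)):
        # Prediction step: sum over previous states, done in the log domain.
        predicted = logsumexp(log_trans + log_post[:, None], axis=0)
        log_post = log_lik[t] + predicted
        log_post = log_post - logsumexp(log_post, axis=0)
        history.append(log_post)
    return np.array(history)

# Hypothetical two-state model observed for three time steps.
prior = np.array([0.5, 0.5])
trans = np.array([[0.9, 0.1], [0.2, 0.8]])
lik = np.array([[0.7, 0.1], [0.6, 0.3], [0.1, 0.9]])
log_posteriors = log_forward_filter(np.log(prior), np.log(trans), np.log(lik))
```

Working with log probabilities turns the products in the message updates into sums, which is what makes an additive quantity a natural carrier for the messages.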
Nonparametric inference of prior probabilities from Bayes-optimal behavior
We discuss a method for obtaining a subject's a priori beliefs from his/her behavior in a psychophysics context, under the assumption that the behavior is (nearly) optimal from a Bayesian perspective. The method is nonparametric in the sense that we do not assume that the prior belongs to any fixed class of distributions (e.g., Gaussian). Despite this increased generality, the method is relatively simple to implement, being based in the simplest case on a linear programming algorithm, and more generally on a straightforward maximum likelihood or maximum a posteriori formulation, which turns out to be a convex optimization problem (with no non-global local maxima) in many important cases. In addition, we develop methods for analyzing the uncertainty of these estimates. We demonstrate the accuracy of the method in a simple simulated coin-flipping setting; in particular, the method is able to precisely track the evolution of the subject's posterior distribution as more and more data are observed.
Bayesian models of human action understanding
We present a Bayesian framework for explaining how people reason about and predict the actions of an intentional agent, based on observing its behavior. Action-understanding is cast as a problem of inverting a probabilistic generative model, which assumes that agents tend to act rationally in order to achieve their goals given the constraints of their environment. Working in a simple sprite-world domain, we show how this model can be used to infer the goal of an agent and predict how the agent will act in novel situations or when environmental constraints change. The model provides a qualitative account of several kinds of inferences that preverbal infants have been shown to perform, and also fits quantitative predictions that adult observers make in a new experiment.
Prediction and Change Detection
We measure the ability of human observers to predict the next datum in a sequence that is generated by a simple statistical process undergoing change at random points in time. Accurate performance in this task requires the identification of changepoints. We assess individual differences between observers both empirically and using two kinds of models: a Bayesian approach for change detection and a family of cognitively plausible fast and frugal models. Some individuals detect too many changes and hence perform sub-optimally due to excess variability. Other individuals do not detect enough changes, and perform sub-optimally because they fail to notice short-term temporal trends.