Learning Graphical Models
Robust and Trend Following Student's t Kalman Smoothers
Aravkin, Aleksandr Y., Burke, James V., Pillonetto, Gianluigi
We present a Kalman smoothing framework based on modeling errors using the heavy tailed Student's t distribution, along with algorithms, convergence theory, open-source general implementation, and several important applications. The computational effort per iteration grows linearly with the length of the time series, and all smoothers allow nonlinear process and measurement models. Robust smoothers form an important subclass of smoothers within this framework. These smoothers work in situations where measurements are highly contaminated by noise or include data unexplained by the forward model. Highly robust smoothers are developed by modeling measurement errors using the Student's t distribution, and outperform the recently proposed L1-Laplace smoother in extreme situations with data containing 20% or more outliers. A second special application we consider in detail allows tracking sudden changes in the state. It is developed by modeling process noise using the Student's t distribution, and the resulting smoother can track sudden changes in the state. These features can be used separately or in tandem, and we present a general smoother algorithm and open source implementation, together with convergence analysis that covers a wide range of smoothers. A key ingredient of our approach is a technique to deal with the non-convexity of the Student's t loss function. Numerical results for linear and nonlinear models illustrate the performance of the new smoothers for robust and tracking applications, as well as for mixed problems that have both types of features.
Bayesian Compressed Regression
Guhaniyogi, Rajarshi, Dunson, David B.
As an alternative to variable selection or shrinkage in high dimensional regression, we propose to randomly compress the predictors prior to analysis. This dramatically reduces storage and computational bottlenecks, performing well when the predictors can be projected to a low dimensional linear subspace with minimal loss of information about the response. As opposed to existing Bayesian dimensionality reduction approaches, the exact posterior distribution conditional on the compressed data is available analytically, speeding up computation by many orders of magnitude while also bypassing robustness issues due to convergence and mixing problems with MCMC. Model averaging is used to reduce sensitivity to the random projection matrix, while accommodating uncertainty in the subspace dimension. Strong theoretical support is provided for the approach by showing near parametric convergence rates for the predictive density in the large p small n asymptotic paradigm. Practical performance relative to competitors is illustrated in simulations and real data applications.
Deep Gaussian Processes
Damianou, Andreas C., Lawrence, Neil D.
In this paper we introduce deep Gaussian process (GP) models. Deep GPs are a deep belief network based on Gaussian process mappings. The data is modeled as the output of a multivariate GP. The inputs to that Gaussian process are then governed by another GP. A single layer model is equivalent to a standard GP or the GP latent variable model (GP-LVM). We perform inference in the model by approximate variational marginalization. This results in a strict lower bound on the marginal likelihood of the model which we use for model selection (number of layers and nodes per layer). Deep belief networks are typically applied to relatively large data sets using stochastic gradient descent for optimization. Our fully Bayesian treatment allows for the application of deep models even when data is scarce. Model selection by our variational bound shows that a five layer hierarchy is justified even when modelling a digit data set containing only 150 examples.
Modeling the Dynamics of Nonverbal Behavior on Interpersonal Trust for Human-Robot Interactions
Lee, Jin Joo (Massachusetts Institute of Technology) | Knox, Brad (Massachusetts Institute of Technology) | Breazeal, Cynthia (Massachusetts Institute of Technology)
We describe research towards creating a computational model for recognizing interpersonal trust in social interactions. We found that four negative gestural cues— leaning-backward, face-touching, hand-touching, and crossing-arms—are together predictive of lower levels of trust. Three positive gestural cues—leaning- forward, having arms-in-lap, and open-arms—are predictive of higher levels of trust. We train a probabilistic graphical model using natural social interaction data, a “Trust Hidden Markov Model” that incorporates the occurrence of these seven important gestures throughout the social interaction. This Trust HMM predicts with 69.44% accuracy whether an individual is willing to behave cooperatively or uncooperatively with their novel partner; in comparison, a gesture-ignorant model achieves 63.89% accuracy. We attempt to automate this recognition process by detecting those trust-related behaviors through 3D motion capture technology and gesture recognition algorithms. We aim to eventually create a hierarchical system—with low-level gesture recognition for high-level trust recognition—that is capable of predicting whether an individual finds another to be a trustworthy or untrustworthy partner through their non- verbal expressions.
ARCO1: An Application of Belief Networks to the Oil Market
Belief networks are a new, potentially important, class of knowledge-based models. ARCO1, currently under development at the Atlantic Richfield Company (ARCO) and the University of Southern California (USC), is the most advanced reported implementation of these models in a financial forecasting setting. ARCO1's underlying belief network models the variables believed to have an impact on the crude oil market. A pictorial market model-developed on a MAC II- facilitates consensus among the members of the forecasting team. The system forecasts crude oil prices via Monte Carlo analyses of the network. Several different models of the oil market have been developed; the system's ability to be updated quickly highlights its flexibility.
On the Generation of Alternative Explanations with Implications for Belief Revision
Department of Computer Science Brown University Providence, RI 02912 Abstract In general, the best explanation for a given observation makes no promises on how good it is with respect to other alternative explanations. A major deficiency of message-passing schemes for belief revision in Bayesian networks is their inability to generate alternatives beyond the second best. In this paper, we present a general approach based on linear constraint systems that naturally generates alternative explanations in an orderly and highly efficient manner. This approach is then applied to cost-based abduction problems as well as belief revision in Bayesian networks. INTRODUCTION We are constantly faced with the problem of explaining the observations we have gathered with our senses. Our explanations are constructed by assuming certain facts or hypotheses which support our observations. For example, suppose I decide to phone my friend Tony at the office.
From Relational Databases to Belief Networks
The relationship between belief networks and relational databases is examined. Based on this analysis, a method to construct belief networks automatically from statistical relational data is proposed. A comparison between our method and other methods shows that our method has several advantages when generalization or prediction is deeded.
Compressed Constraints in Probabilistic Logic and Their Revision
In probabilistic logic entailments, even moderate size problems can yield linear constraint systems with so many variables that exact methods are impractical. This difficulty can be remedied in many cases of interest by introducing a threevalued logic (true, false, and "don't care"). The three-valued approach allows the construction of "compressed" constraint systems which have the same solution sets as their two-valued counterparts, but which may involve dramatically fewer variables. PROLIFERATION OF WORLDS An entailment problem in Nilsson's (1986) probabilistic logic derives an estimate for the prior probability of one sentence (hereafter, the "target") from the priors for a set of other ("source") sentences. V is a matrix derived from an inventory of all consistent patterns of truth assignments (1 true, 0 false) for the source and target sentences.
Algorithms for Irrelevance-Based Partial MAPs
Irrelevance-based partial MAPs are useful constructs for domain-independent explanation using belief networks. We look at two definitions for such partial MAPs, and prove important properties that are useful in designing algorithms for computing them effectively. We make use of these properties in modifying our standard MAP best-first algorithm, so as to handle irrelevance-based partial MAPs.
A Fusion Algorithm for Solving Bayesian Decision Problems
This paper proposes a new method for solving Bayesian decision problems. The method consists of representing a Bayesian decision problem as a valuation-based system and applying a fusion algorithm for solving it. The fusion algorithm is a hybrid of local computational methods for computation of marginals of joint probability distributions and the local computational methods for discrete optimization problems.