 Undirected Networks


Inference in Probabilistic Logic Programs with Continuous Random Variables

arXiv.org Artificial Intelligence

Probabilistic Logic Programming (PLP), exemplified by Sato and Kameya's PRISM, Poole's ICL, De Raedt et al.'s ProbLog and Vennekens et al.'s LPAD, is aimed at combining statistical and logical knowledge representation and inference. A key characteristic of PLP frameworks is that they are conservative extensions to non-probabilistic logic programs, which have been widely used for knowledge representation. PLP frameworks extend traditional logic programming semantics to a distribution semantics, where the semantics of a probabilistic logic program is given in terms of a distribution over possible models of the program. However, the inference techniques used in these works rely on enumerating sets of explanations for a query answer. Consequently, these languages permit very limited use of random variables with continuous distributions. In this paper, we present a symbolic inference procedure that uses constraints and represents sets of explanations without enumeration. This permits us to reason over PLPs with Gaussian or Gamma-distributed random variables (in addition to discrete-valued random variables) and linear equality constraints over reals. We develop the inference procedure in the context of PRISM; however, the procedure's core ideas can be easily applied to other PLP languages as well. An interesting aspect of our inference procedure is that PRISM's query evaluation process becomes a special case in the absence of any continuous random variables in the program. The symbolic inference procedure enables us to reason over complex probabilistic models, such as Kalman filters and a large subclass of Hybrid Bayesian networks, that were hitherto not possible in PLP frameworks. (To appear in Theory and Practice of Logic Programming).


Decision-Theoretic Coordination and Control for Active Multi-Camera Surveillance in Uncertain, Partially Observable Environments

arXiv.org Artificial Intelligence

A central problem of surveillance is to monitor multiple targets moving in a large-scale, obstacle-ridden environment with occlusions. This paper presents a novel principled Partially Observable Markov Decision Process-based approach to coordinating and controlling a network of active cameras for tracking and observing multiple mobile targets at high resolution in such surveillance environments. Our proposed approach is capable of (a) maintaining a belief over the targets' states (i.e., locations, directions, and velocities) to track them, even when they may not be observed directly by the cameras at all times, (b) coordinating the cameras' actions to simultaneously improve the belief over the targets' states and maximize the expected number of targets observed with a guaranteed resolution, and (c) exploiting the inherent structure of our surveillance problem to improve its scalability (i.e., linear time) in the number of targets to be observed. Quantitative comparisons with state-of-the-art multi-camera coordination and control techniques show that our approach can achieve higher surveillance quality in real time. The practical feasibility of our approach is also demonstrated using real AXIS 214 PTZ cameras.
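To make point (a) concrete, here is a minimal sketch of a discrete Bayes-filter belief update for a single target, the standard building block of POMDP tracking. It is not the paper's method: the two-cell world, the transition matrix, the detection probabilities, and the name `belief_update` are all assumptions chosen for illustration.

```python
def belief_update(belief, transition, obs_likelihood):
    """One Bayes-filter step: predict with the motion model, then reweight
    by the likelihood of the camera observation in each cell."""
    n = len(belief)
    # Prediction: where the target may have moved, given the previous belief.
    predicted = [sum(belief[s] * transition[s][t] for s in range(n))
                 for t in range(n)]
    # Correction: multiply by P(observation | target in cell t), then normalize.
    unnorm = [obs_likelihood[t] * predicted[t] for t in range(n)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Hypothetical two-cell world: the camera watches cell 0 only.
prior = [0.5, 0.5]
transition = [[0.8, 0.2],   # P(next cell | currently in cell 0)
              [0.3, 0.7]]   # P(next cell | currently in cell 1)
# Observation "no detection": unlikely if the target is in the watched cell,
# certain if it is in the unobserved (e.g. occluded) cell.
no_detection = [0.1, 1.0]

posterior = belief_update(prior, transition, no_detection)
```

Even without a direct sighting, the belief shifts toward the unobserved cell, which is the sense in which a target can still be tracked while occluded.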


Subset Selection for Gaussian Markov Random Fields

arXiv.org Machine Learning

Given the joint distribution of a set of random variables (in the form of a Markov random field), we consider the problem of selecting a small subset of these variables to observe so as to accurately predict the remaining unobserved variables. We focus here on Gaussian processes (Rasmussen and Williams, 2006) on graphs, i.e., Gaussian Markov random fields (Gaussian MRFs). Our aim in this paper is to give a subset selection algorithm which, given a budget for the number of variables that can be observed, minimizes the expected squared prediction error averaged over all the variables. We are particularly interested in algorithms with provable guarantees on the prediction error. Our main focus is on Gaussian MRFs on trees and other treelike graphs, or to be precise, bounded tree-width graphs; such graphs have been widely studied in the context of inference (see, e.g., Sudderth (2002)). We also consider a special class of Gaussian MRFs, called Gaussian free fields (GFFs), which arise in computer vision, among other areas (see, e.g., Szeliski (1990)). We first explain the notation we use and formally state our problem before describing how our work relates to previous research.
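As a rough illustration of the objective (not the algorithm from the paper), the sketch below greedily picks variables to observe in a small Gaussian model so as to reduce the total posterior variance, which equals the expected squared prediction error summed over all variables. The example covariance matrix, the function names, and the greedy rule itself are assumptions made for illustration; a greedy choice carries no guarantee here.

```python
def condition_on(cov, j):
    """Covariance of all variables after observing variable j
    (standard Gaussian conditioning as a rank-one update)."""
    n = len(cov)
    return [[cov[a][b] - cov[a][j] * cov[j][b] / cov[j][j] for b in range(n)]
            for a in range(n)]

def total_variance(cov):
    """Sum of variances = expected squared prediction error over all variables."""
    return sum(cov[i][i] for i in range(len(cov)))

def greedy_select(cov, budget):
    """Greedily observe the variable that most reduces the total variance."""
    chosen = []
    current = [row[:] for row in cov]
    for _ in range(budget):
        best_j, best_cov, best_err = None, None, float("inf")
        for j in range(len(cov)):
            # Skip variables already observed (their variance is zero).
            if j in chosen or current[j][j] <= 1e-12:
                continue
            candidate = condition_on(current, j)
            err = total_variance(candidate)
            if err < best_err:
                best_j, best_cov, best_err = j, candidate, err
        if best_j is None:
            break
        chosen.append(best_j)
        current = best_cov
    return chosen, total_variance(current)

# Hypothetical 3-node path (a Gaussian MRF on a chain) with correlation
# 0.5 between neighbours and 0.25 between the two end nodes.
COV = [[1.0, 0.5, 0.25],
       [0.5, 1.0, 0.5],
       [0.25, 0.5, 1.0]]

chosen, remaining_error = greedy_select(COV, budget=1)
```

With a budget of one, the middle node is the best single observation on this chain, since it is correlated with both ends.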


Optimal Weighting of Multi-View Data with Low Dimensional Hidden States

arXiv.org Machine Learning

In areas like Natural Language Processing, data are often multi-view and high-dimensional. Recently, CCA [8] has been applied to the multi-view setting as an unsupervised dimension reduction method in [7][10][3], with performance guarantees if the data are generated under a certain structure. In [7], the authors assume that the high-dimensional multi-view data are generated independently conditioned on a low-dimensional hidden state (the model structure will be illustrated later in detail). Under this assumption, the low-dimensional features provided by CCA lose no useful information compared with the original high-dimensional features when applied to linear regression. Also, [6] has applied this CCA method to generate a low-dimensional vector representation of words which works well in many NLP tasks. CCA works well because the low-dimensional hidden state (throughout the paper we use k to denote the dimension of the hidden state) contains most of the information for the supervised tasks, and by doing CCA we are able to generate a k-dimensional estimate of the hidden state from each view, as mentioned by [4]; more precisely, via CCA we can find all k directions in the high-dimensional space of each view that have nonzero correlation with the hidden state. Two views are enough to implement the CCA algorithms above (see [7] for a detailed introduction to CCA). Despite its power in dimension reduction, CCA with two views is still not optimal, in the sense that it ends up with a hidden state estimator from each view but it is impossible to tell which view is better by looking only at the two views.
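The two-view setting described above can be sketched with a textbook CCA computed by whitening each view and taking an SVD of the whitened cross-covariance. This is a generic construction, not the weighting scheme the paper goes on to propose; the synthetic data, the dimensions, the noise level, and the small ridge term `reg` are assumptions made for illustration.

```python
import numpy as np

def cca(X, Y, k, reg=1e-8):
    """Textbook CCA: returns all canonical correlations and the top-k
    projection matrices for views X (n x dx) and Y (n x dy)."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Cxx = X.T @ X / n + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / n
    # Whitening maps from Cholesky factors, so that Wx.T @ Cxx @ Wx = I.
    Wx = np.linalg.inv(np.linalg.cholesky(Cxx)).T
    Wy = np.linalg.inv(np.linalg.cholesky(Cyy)).T
    # Singular values of the whitened cross-covariance are the
    # canonical correlations.
    U, s, Vt = np.linalg.svd(Wx.T @ Cxy @ Wy)
    return s, Wx @ U[:, :k], Wy @ Vt[:k].T

# Synthetic data: two views that are conditionally independent given a
# k-dimensional hidden state h (all sizes here are arbitrary choices).
rng = np.random.default_rng(0)
n, k, dx, dy = 5000, 2, 6, 5
h = rng.normal(size=(n, k))
X = h @ rng.normal(size=(k, dx)) + 0.1 * rng.normal(size=(n, dx))
Y = h @ rng.normal(size=(k, dy)) + 0.1 * rng.normal(size=(n, dy))

corrs, A, B = cca(X, Y, k)
hidden_estimate_from_x = (X - X.mean(axis=0)) @ A  # k-dim estimate from view 1
hidden_estimate_from_y = (Y - Y.mean(axis=0)) @ B  # k-dim estimate from view 2
```

In this setup only the first k canonical correlations should be far from zero, which reflects the claim that CCA finds the k directions correlated with the hidden state. Each view yields its own estimate, and nothing in the output says which of the two is better.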


Bellman Error Based Feature Generation using Random Projections on Sparse Spaces

arXiv.org Machine Learning

We address the problem of automatic generation of features for value function approximation. Bellman Error Basis Functions (BEBFs) have been shown to improve the error of policy evaluation with function approximation, with a convergence rate similar to that of value iteration. We propose a simple, fast and robust algorithm based on random projections to generate BEBFs for sparse feature spaces. We provide a finite sample analysis of the proposed method, and prove that projections logarithmic in the dimension of the original space are enough to guarantee contraction in the error. Empirical results demonstrate the strength of this method.
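One step of the idea can be sketched as follows: compute temporal-difference errors under the current value estimate, compress the sparse features with a random projection, and regress the errors on the compressed features to obtain a new basis function. This is a simplified reading of the abstract, not the authors' algorithm; the synthetic data, the projected dimension `d`, the Gaussian projection matrix, and the name `bebf_feature` are assumptions.

```python
import numpy as np

def bebf_feature(X, td_errors, d, rng):
    """Fit the TD errors by least squares on a d-dimensional random
    projection of X; return the new feature's weights in the original space."""
    D = X.shape[1]
    # Gaussian random projection matrix (D x d), scaled to roughly
    # preserve norms.
    P = rng.normal(0.0, 1.0 / np.sqrt(d), size=(D, d))
    Z = X @ P                                    # projected features (n x d)
    w, *_ = np.linalg.lstsq(Z, td_errors, rcond=None)
    return P @ w                                 # new feature: x -> x @ (P @ w)

rng = np.random.default_rng(0)
n, D, d, gamma = 200, 1000, 20, 0.9

# Sparse binary features: 5 active entries out of 1000 per sampled state.
X = np.zeros((n, D))
for i in range(n):
    X[i, rng.choice(D, size=5, replace=False)] = 1.0

rewards = rng.normal(size=n)
v = np.zeros(n)        # current value estimate at sampled states
v_next = np.zeros(n)   # current value estimate at successor states
td_errors = rewards + gamma * v_next - v   # sampled Bellman (TD) errors

new_feature_weights = bebf_feature(X, td_errors, d, rng)
fitted = X @ new_feature_weights
```

The regression is solved in `d` dimensions rather than `D`, which is where the saving comes from when `d` grows only logarithmically with `D`.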


Application of Fuzzy Mathematics to Speech-to-Text Conversion by Elimination of Paralinguistic Content

arXiv.org Artificial Intelligence

For the past few decades, man has been trying to create an intelligent computer which can talk and respond like he can. The task of creating a system that can talk like a human being is the primary objective of Automatic Speech Recognition. Various Speech Recognition techniques have been developed in theory and have been applied in practice. This paper discusses the problems that have been encountered in developing Speech Recognition, the techniques that have been applied to automate the task, and a representation of the core problems of present day Speech Recognition by using Fuzzy Mathematics.


Regret Bounds for Restless Markov Bandits

arXiv.org Machine Learning

We consider the restless Markov bandit problem, in which the state of each arm evolves according to a Markov process independently of the learner's actions. We suggest an algorithm that after $T$ steps achieves $\tilde{O}(\sqrt{T})$ regret with respect to the best policy that knows the distributions of all arms. No assumptions on the Markov chains are made except that they are irreducible. In addition, we show that index-based policies are necessarily suboptimal for the considered problem.


Bayesian Nonparametric Hidden Semi-Markov Models

arXiv.org Machine Learning

There is much interest in the Hierarchical Dirichlet Process Hidden Markov Model (HDP-HMM) as a natural Bayesian nonparametric extension of the ubiquitous Hidden Markov Model for learning from sequential and time-series data. However, in many settings the HDP-HMM's strict Markovian constraints are undesirable, particularly if we wish to learn or encode non-geometric state durations. We can extend the HDP-HMM to capture such structure by drawing upon explicit-duration semi-Markov modeling, which has been developed mainly in the parametric non-Bayesian setting, to allow construction of highly interpretable models that admit natural prior information on state durations. In this paper we introduce the explicit-duration Hierarchical Dirichlet Process Hidden semi-Markov Model (HDP-HSMM) and develop sampling algorithms for efficient posterior inference. The methods we introduce also provide new methods for sampling inference in the finite Bayesian HSMM. Our modular Gibbs sampling methods can be embedded in samplers for larger hierarchical Bayesian models, adding semi-Markov chain modeling as another tool in the Bayesian inference toolbox. We demonstrate the utility of the HDP-HSMM and our inference methods on both synthetic and real experiments.
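The explicit-duration idea can be illustrated with the generative process of a finite HSMM: each visit to a state draws a duration from a state-specific distribution, emits that many observations, and then moves to a different state. The sketch below covers only this forward sampling, not the HDP prior or the paper's Gibbs samplers; the uniform duration ranges, the Gaussian emissions, and all names are assumptions chosen for illustration.

```python
import random
from itertools import groupby

def sample_hsmm(trans, dur_range, emit_mean, length, rng, state=0):
    """Sample a state sequence and observations from an explicit-duration HSMM."""
    states, obs = [], []
    while len(states) < length:
        lo, hi = dur_range[state]
        duration = rng.randint(lo, hi)       # explicit, non-geometric duration
        for _ in range(duration):
            if len(states) == length:        # truncate the final segment
                break
            states.append(state)
            obs.append(rng.gauss(emit_mean[state], 1.0))
        # Self-transitions are excluded: segment length is governed by the
        # duration distribution alone.
        others = [s for s in range(len(trans)) if s != state]
        weights = [trans[state][s] for s in others]
        state = rng.choices(others, weights=weights)[0]
    return states, obs

# Hypothetical 3-state model.
TRANS = [[0.0, 0.7, 0.3],
         [0.5, 0.0, 0.5],
         [0.2, 0.8, 0.0]]
DUR_RANGE = [(2, 4), (5, 8), (3, 3)]   # (min, max) duration per state
EMIT_MEAN = [0.0, 5.0, -5.0]

rng = random.Random(0)
states, obs = sample_hsmm(TRANS, DUR_RANGE, EMIT_MEAN, length=100, rng=rng)
segments = [(s, len(list(run))) for s, run in groupby(states)]
```

In an ordinary HMM the segment lengths would be geometric; here every complete segment has a length inside the range specified for its state.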


A Method of Moments for Mixture Models and Hidden Markov Models

arXiv.org Machine Learning

Mixture models are a fundamental tool in applied statistics and machine learning for treating data taken from multiple subpopulations. The current practice for estimating the parameters of such models relies on local search heuristics (e.g., the EM algorithm) which are prone to failure, and existing consistent methods are unfavorable due to their high computational and sample complexity which typically scale exponentially with the number of mixture components. This work develops an efficient method of moments approach to parameter estimation for a broad class of high-dimensional mixture models with many components, including multi-view mixtures of Gaussians (such as mixtures of axis-aligned Gaussians) and hidden Markov models. The new method leads to rigorous unsupervised learning results for mixture models that were not achieved by previous works; and, because of its simplicity, it offers a viable alternative to EM for practical deployment.


Contextually Guided Semantic Labeling and Search for 3D Point Clouds

arXiv.org Artificial Intelligence

RGB-D cameras, which give an RGB image together with depths, are becoming increasingly popular for robotic perception. In this paper, we address the task of detecting commonly found objects in the 3D point cloud of indoor scenes obtained from such cameras. Our method uses a graphical model that captures various features and contextual relations, including the local visual appearance and shape cues, object co-occurrence relationships and geometric relationships. With a large number of object classes and relations, the model's parsimony becomes important and we address that by using multiple types of edge potentials. We train the model using a maximum-margin learning approach. In our experiments over a total of 52 3D scenes of homes and offices (composed from about 550 views), we get a performance of 84.06% and 73.38% in labeling office and home scenes respectively for 17 object classes each. We also present a method for a robot to search for an object using the learned model and the contextual information available from the current labelings of the scene. We applied this algorithm successfully on a mobile robot for the task of finding 12 object classes in 10 different offices and achieved a precision of 97.56% with 78.43% recall.