Goto

Collaborating Authors

 Learning Graphical Models


Bayesian Biosurveillance of Disease Outbreaks

arXiv.org Artificial Intelligence

Early, reliable detection of disease outbreaks is a critical problem today. This paper reports an investigation of the use of causal Bayesian networks to model spatio-temporal patterns of a non-contagious disease (respiratory anthrax infection) in a population of people. The number of parameters in such a network can become enormous, if not carefully managed. Also, inference needs to be performed in real time as population data stream in. We describe techniques we have applied to address both the modeling and inference challenges. A key contribution of this paper is the explication of assumptions and techniques that are sufficient to allow the scaling of Bayesian network modeling and inference to millions of nodes for real-time surveillance applications. The results reported here provide a proof-of-concept that Bayesian networks can serve as the foundation of a system that effectively performs Bayesian biosurveillance of disease outbreaks.


Propositional and Relational Bayesian Networks Associated with Imprecise and Qualitative Probabilistic Assesments

arXiv.org Artificial Intelligence

This paper investigates a representation language with flexibility inspired by probabilistic logic and compactness inspired by relational Bayesian networks. The goal is to handle propositional and first-order constructs together with precise, imprecise, indeterminate and qualitative probabilistic assessments. The paper shows how this can be achieved through the theory of credal networks. New exact and approximate inference algorithms based on multilinear programming and iterated/loopy propagation of interval probabilities are presented; their superior performance, compared to existing ones, is shown empirically.


Mixtures of Deterministic-Probabilistic Networks and their AND/OR Search Space

arXiv.org Artificial Intelligence

The paper introduces mixed networks, a new framework for expressing and reasoning with probabilistic and deterministic information. The framework combines belief networks with constraint networks, defining the semantics and graphical representation. We also introduce the AND/OR search space for graphical models, and develop a new linear space search algorithm. This provides the basis for understanding the benefits of processing the constraint information separately, resulting in the pruning of the search space. When the constraint part is tractable or has a small number of solutions, using the mixed representation can be exponentially more effective than using pure belief networks which model constraints as conditional probability tables.


Region-Based Incremental Pruning for POMDPs

arXiv.org Artificial Intelligence

We present a major improvement to the incremental pruning algorithm for solving partially observable Markov decision processes. Our technique targets the cross-sum step of the dynamic programming (DP) update, a key source of complexity in POMDP algorithms. Instead of reasoning about the whole belief space when pruning the cross-sums, our algorithm divides the belief space into smaller regions and performs independent pruning in each region. We evaluate the benefits of the new technique both analytically and experimentally, and show that it produces very significant performance gains. The results contribute to the scalability of POMDP algorithms to domains that cannot be handled by the best existing techniques.


Dynamic Programming for Structured Continuous Markov Decision Problems

arXiv.org Artificial Intelligence

We describe an approach for exploiting structure in Markov Decision Processes with continuous state variables. At each step of the dynamic programming, the state space is dynamically partitioned into regions where the value function is the same throughout the region. We first describe the algorithm for piecewise constant representations. We then extend it to piecewise linear representations, using techniques from POMDPs to represent and reason about linear surfaces efficiently. We show that for complex, structured problems, our approach exploits the natural structure so that optimal solutions can be computed efficiently.


Metrics for Finite Markov Decision Processes

arXiv.org Artificial Intelligence

The formulation of our metrics is based on the notion of bisimulation for MDPs, with an aim towards solving discounted infinite horizon reinforcement learning tasks. Such metrics can be used to aggregate states, as well as to better structure other value function approximators (e.g., memory-based or nearest-neighbor approximators). We provide bounds that relate our metric distances to the optimal values of states in the given MDP.


Exploiting First-Order Regression in Inductive Policy Selection

arXiv.org Artificial Intelligence

We consider the problem of computing optimal generalised policies for relational Markov decision processes. We describe an approach combining some of the benefits of purely inductive techniques with those of symbolic dynamic programming methods. The latter reason about the optimal value function using first-order decision theoretic regression and formula rewriting, while the former, when provided with a suitable hypotheses language, are capable of generalising value functions or policies for small instances. Our idea is to use reasoning and in particular classical first-order regression to automatically generate a hypotheses language dedicated to the domain at hand, which is then used as input by an inductive solver. This approach avoids the more complex reasoning of symbolic dynamic programming while focusing the inductive solver's attention on concepts that are specifically relevant to the optimal value function for the domain considered.


Dual-Space Analysis of the Sparse Linear Model

arXiv.org Machine Learning

Sparse linear (or generalized linear) models combine a standard likelihood function with a sparse prior on the unknown coefficients. These priors can conveniently be expressed as a maximization over zero-mean Gaussians with different variance hyperparameters. Standard MAP estimation (Type I) involves maximizing over both the hyperparameters and coefficients, while an empirical Bayesian alternative (Type II) first marginalizes the coefficients and then maximizes over the hyperparameters, leading to a tractable posterior approximation. The underlying cost functions can be related via a dual-space framework from Wipf et al. (2011), which allows both the Type I or Type II objectives to be expressed in either coefficient or hyperparmeter space. This perspective is useful because some analyses or extensions are more conducive to development in one space or the other. Herein we consider the estimation of a trade-off parameter balancing sparsity and data fit. As this parameter is effectively a variance, natural estimators exist by assessing the problem in hyperparameter (variance) space, transitioning natural ideas from Type II to solve what is much less intuitive for Type I. In contrast, for analyses of update rules and sparsity properties of local and global solutions, as well as extensions to more general likelihood models, we can leverage coefficient-space techniques developed for Type I and apply them to Type II. For example, this allows us to prove that Type II-inspired techniques can be successful recovering sparse coefficients when unfavorable restricted isometry properties (RIP) lead to failure of popular L1 reconstructions. It also facilitates the analysis of Type II when non-Gaussian likelihood models lead to intractable integrations.


A Spectral Learning Approach to Range-Only SLAM

arXiv.org Machine Learning

We present a novel spectral learning algorithm for simultaneous localization and mapping (SLAM) from range data with known correspondences. This algorithm is an instance of a general spectral system identification framework, from which it inherits several desirable properties, including statistical consistency and no local optima. Compared with popular batch optimization or multiple-hypothesis tracking (MHT) methods for range-only SLAM, our spectral approach offers guaranteed low computational requirements and good tracking performance. Compared with popular extended Kalman filter (EKF) or extended information filter (EIF) approaches, and many MHT ones, our approach does not need to linearize a transition or measurement model; such linearizations can cause severe errors in EKFs and EIFs, and to a lesser extent MHT, particularly for the highly non-Gaussian posteriors encountered in range-only SLAM. We provide a theoretical analysis of our method, including finite-sample error bounds. Finally, we demonstrate on a real-world robotic SLAM problem that our algorithm is not only theoretically justified, but works well in practice: in a comparison of multiple methods, the lowest errors come from a combination of our algorithm with batch optimization, but our method alone produces nearly as good a result at far lower computational cost. 1 Introduction In range-only SLAM, we are given a sequence of range measurements from a robot to fixed landmarks, and possibly a matching sequence of odometry measurements. We then attempt to simultaneously estimate the robot's trajectory and the locations of the landmarks. In all the above approaches, the most popular representation for a hypothesis is a list of landmark locations (m n,x,m n,y) and a list of robot poses (x t,y t,θ t) . Unfortunately, both the motion and measurement models are highly nonlinear in this representation, leading to computational problems: inaccurate linearizations in EKF/EIF/MHT and local optima in batch optimization approaches (see Section 2 for details). Much work has attempted to remedy this problem, e.g., by changing the hypothesis representation (Djugash, 2010) or by keeping multiple hypotheses (Djugash et al., 2005; Djugash, 2010; Thrun et al., 2005). While considerable progress has been made, none of these methods are ideal; common difficulties include the need for an extensive initialization phase, inability to recover from poor initialization, lack of performance guarantees, or excessive computational requirements. We take a very different approach: we formulate range-only SLAM as a matrix factorization problem, where features of observations are linearly related to a 4-or 7-dimensional state space.


Non-Convex Rank Minimization via an Empirical Bayesian Approach

arXiv.org Machine Learning

In many applications that require matrix solutions of minimal rank, the underlying cost function is non-convex leading to an intractable, NP-hard optimization problem. Consequently, the convex nuclear norm is frequently used as a surrogate penalty term for matrix rank. The problem is that in many practical scenarios there is no longer any guarantee that we can correctly estimate generative low-rank matrices of interest, theoretical special cases notwithstanding. Consequently, this paper proposes an alternative empirical Bayesian procedure build upon a variational approximation that, unlike the nuclear norm, retains the same globally minimizing point estimate as the rank function under many useful constraints. However, locally minimizing solutions are largely smoothed away via marginalization, allowing the algorithm to succeed when standard convex relaxations completely fail. While the proposed methodology is generally applicable to a wide range of low-rank applications, we focus our attention on the robust principal component analysis problem (RPCA), which involves estimating an unknown low-rank matrix with unknown sparse corruptions. Theoretical and empirical evidence are presented to show that our method is potentially superior to related MAP-based approaches, for which the convex principle component pursuit (PCP) algorithm (Candes et al., 2011) can be viewed as a special case.