 Learning Graphical Models


Asynchronous Anytime Sequential Monte Carlo

arXiv.org Machine Learning

We introduce a new sequential Monte Carlo algorithm we call the particle cascade. The particle cascade is an asynchronous, anytime alternative to traditional particle filtering algorithms. It uses no barrier synchronizations, which leads to improved particle throughput and memory efficiency. It is an anytime algorithm in the sense that it can be run forever to emit an unbounded number of particles while keeping within a fixed memory budget. We prove that the particle cascade is an unbiased marginal likelihood estimator, which means that it can be straightforwardly plugged into existing pseudomarginal methods.
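
For context, the marginal likelihood estimate that a traditional particle filter produces can be sketched as follows. This is a standard bootstrap particle filter, not the particle cascade itself; the linear-Gaussian model, the function names, and the parameters `a`, `q`, and `r` are assumptions chosen only for illustration. The resampling step at the end of each iteration needs all particle weights at once, which is the kind of barrier synchronization the abstract refers to.

```python
import math
import random


def normal_logpdf(x, mean, std):
    """Log density of a normal distribution with the given mean and std."""
    return -0.5 * math.log(2.0 * math.pi * std * std) - 0.5 * ((x - mean) / std) ** 2


def bootstrap_log_marginal(ys, n_particles, a=0.9, q=1.0, r=1.0, seed=0):
    """Log of the bootstrap particle filter's marginal likelihood estimate.

    Assumed toy model (hypothetical, not from the paper):
        x_0 = 0,  x_t = a * x_{t-1} + N(0, q^2),  y_t = x_t + N(0, r^2)
    """
    rng = random.Random(seed)
    particles = [0.0] * n_particles
    log_z = 0.0
    for y in ys:
        # Propagate every particle through the transition model.
        particles = [a * x + rng.gauss(0.0, q) for x in particles]
        # Weight each particle by the observation likelihood.
        log_w = [normal_logpdf(y, x, r) for x in particles]
        m = max(log_w)
        w = [math.exp(lw - m) for lw in log_w]
        # The running product of mean weights estimates p(y_1, ..., y_t).
        log_z += m + math.log(sum(w) / n_particles)
        # Multinomial resampling: requires all weights, hence a barrier.
        particles = rng.choices(particles, weights=w, k=n_particles)
    return log_z


if __name__ == "__main__":
    print(bootstrap_log_marginal([0.5, -0.2, 1.0], n_particles=1000))
```

Note that it is the estimate itself (the exponential of the returned value) that is unbiased for a filter of this kind, not its logarithm.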


A Compilation Target for Probabilistic Programming Languages

arXiv.org Artificial Intelligence

Forward inference techniques such as sequential Monte Carlo and particle Markov chain Monte Carlo for probabilistic programming can be implemented in any programming language by creative use of standardized operating system functionality including processes, forking, mutexes, and shared memory. Exploiting this, we have defined, developed, and tested an intermediate representation language for probabilistic programming that we call probabilistic C, which itself can be compiled to machine code by standard compilers and linked to operating system libraries, yielding an efficient, scalable, portable probabilistic programming compilation target. This opens up a new hardware and systems research path for optimizing probabilistic programming systems.


Learning Probabilistic Programs

arXiv.org Artificial Intelligence

We develop a technique for generalising from data in which models are samplers represented as program text. We establish encouraging empirical results that suggest that Markov chain Monte Carlo probabilistic programming inference techniques coupled with higher-order probabilistic programming languages are now sufficiently powerful to enable successful inference of this kind in nontrivial domains. We also introduce a new notion of probabilistic program compilation and show how the same machinery might be used in the future to compile probabilistic programs for efficient reusable predictive inference.


Inferring latent structures via information inequalities

arXiv.org Machine Learning

One of the goals of probabilistic inference is to decide whether an empirically observed distribution is compatible with a candidate Bayesian network. However, Bayesian networks with hidden variables give rise to highly non-trivial constraints on the observed distribution. Here, we propose an information-theoretic approach, based on the insight that conditions on entropies of Bayesian networks take the form of simple linear inequalities. We describe an algorithm for deriving entropic tests for latent structures. The well-known conditional independence tests appear as a special case. While the approach applies for generic Bayesian networks, we presently adopt the causal view, and show the versatility of the framework by treating several relevant problems from that domain: detecting common ancestors, quantifying the strength of causal influence, and inferring the direction of causation from two-variable marginals.
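
As a small illustration of how a conditional independence test can be written as a linear expression in entropies, the sketch below computes the conditional mutual information I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z) for a toy common-cause distribution. The distribution, the helper names, and the `flip` parameter are assumptions made for this example; it is not the algorithm proposed in the paper.

```python
import math
from itertools import product


def entropy(joint, axes):
    """Shannon entropy (bits) of the marginal over the given variable indices."""
    marginal = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in axes)
        marginal[key] = marginal.get(key, 0.0) + p
    return -sum(p * math.log2(p) for p in marginal.values() if p > 0.0)


def conditional_mutual_information(joint, x, y, z):
    """I(X;Y|Z) written as a linear combination of joint entropies."""
    return (entropy(joint, [x, z]) + entropy(joint, [y, z])
            - entropy(joint, [x, y, z]) - entropy(joint, [z]))


def common_cause(flip=0.1):
    """Toy joint over (X, Y, Z): Z is a fair coin; X and Y are noisy copies of Z."""
    joint = {}
    for x, y, z in product((0, 1), repeat=3):
        px = 1.0 - flip if x == z else flip
        py = 1.0 - flip if y == z else flip
        joint[(x, y, z)] = 0.5 * px * py
    return joint


if __name__ == "__main__":
    joint = common_cause()
    # X and Y are dependent marginally but independent given Z.
    print(entropy(joint, [0]) + entropy(joint, [1]) - entropy(joint, [0, 1]))
    print(conditional_mutual_information(joint, 0, 1, 2))
```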


DimmWitted: A Study of Main-Memory Statistical Analytics

arXiv.org Machine Learning

We perform the first study of the tradeoff space of access methods and replication to support statistical analytics using first-order methods executed in the main memory of a Non-Uniform Memory Access (NUMA) machine. Statistical analytics systems differ from conventional SQL-analytics in the amount and types of memory incoherence they can tolerate. Our goal is to understand tradeoffs in accessing the data in row- or column-order and at what granularity one should share the model and data for a statistical task. We study this new tradeoff space, and discover there are tradeoffs between hardware and statistical efficiency. We argue that our tradeoff study may provide valuable information for designers of analytics engines: for each system we consider, our prototype engine can run at least one popular task at least 100x faster. We conduct our study across five architectures using popular models including SVMs, logistic regression, Gibbs sampling, and neural networks.


Reconstructing Velocities of Migrating Birds from Weather Radar – A Case Study in Computational Sustainability

AI Magazine

Each volume scan consists of a sequence of sweeps during which the antenna rotates 360 degrees around a vertical axis while keeping its elevation angle fixed (figure 2). The result of each sweep is a set of raster data products summarizing the radar signal returned from targets within discrete pulse volumes, which are the portions of the atmosphere sensed at a particular antenna position and range from the radar. The coordinates of each pulse volume (r, ϕ, ρ) are measured in a three-dimensional polar coordinate system: r is the distance in meters from the antenna, ϕ is the azimuth, which is the angle in the horizontal plane between the antenna direction and a fixed reference direction (typically degrees clockwise from due north), and ρ is the elevation angle, which is the angle between the antenna direction and its projection onto the horizontal plane.

… radial velocity data. For any given pulse volume, radial velocity tells us the component of target velocity in the direction of the radar beam, and we have no additional information about the component orthogonal to the radar beam. However, the overall pattern of the sweep often provides clear evidence about the true target velocities. In this example, targets to the northeast (NE) of the radar station have negative radial velocities (dark colors), which means they are approaching the radar, and targets to the southwest (SW) of the radar station have positive radial velocities (light colors), which means they are departing the radar station. We can infer that the targets (in this case, predominantly migrating birds) are moving uniformly in a SW direction, as shown in panel (c). The spiral pattern in the velocity image is due to changes …
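
The radial-velocity geometry described in this excerpt can be sketched in a few lines. The code below is a minimal illustration under simplifying assumptions (a flat east/north/up frame, no beam refraction or Earth curvature); the function names and the 10 m/s example velocity are hypothetical and not taken from the article.

```python
import math


def beam_direction(azimuth_deg, elevation_deg):
    """Unit vector (east, north, up) along the radar beam.

    Azimuth is measured in degrees clockwise from due north; elevation is the
    angle between the beam and its projection onto the horizontal plane.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    return (math.cos(el) * math.sin(az),
            math.cos(el) * math.cos(az),
            math.sin(el))


def radial_velocity(velocity_enu, azimuth_deg, elevation_deg):
    """Component of the target velocity along the beam.

    Negative values mean the target approaches the radar; positive values
    mean it is departing.
    """
    direction = beam_direction(azimuth_deg, elevation_deg)
    return sum(v * d for v, d in zip(velocity_enu, direction))


if __name__ == "__main__":
    # Assumed example: targets flying toward the southwest at 10 m/s.
    speed = 10.0
    toward_sw = (-speed * math.sqrt(0.5), -speed * math.sqrt(0.5), 0.0)
    print(radial_velocity(toward_sw, 45.0, 0.5))    # beam pointing NE
    print(radial_velocity(toward_sw, 225.0, 0.5))   # beam pointing SW
    print(radial_velocity(toward_sw, 135.0, 0.5))   # beam orthogonal to motion
```

With these conventions, a beam pointing northeast sees a negative radial velocity and a beam pointing southwest sees a positive one, matching the pattern described in the text, while a beam orthogonal to the motion sees almost none of it.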


Sequential Decision Making in Computational Sustainability via Adaptive Submodularity

AI Magazine

Many problems in computational sustainability require making a sequence of decisions in complex, uncertain environments. Such problems are notoriously difficult. In this article, we review the recently discovered notion of adaptive submodularity, an intuitive diminishing returns condition that generalizes the classical notion of submodular set functions to sequential decision problems. Problems exhibiting the adaptive submodularity property can be efficiently and provably near-optimally solved using simple myopic policies. We illustrate this concept in several case studies of interest in computational sustainability: First, we demonstrate how it can be used to efficiently plan for resolving uncertainty in adaptive management scenarios. Second, we show how it applies to dynamic conservation planning for protecting endangered species, a case study carried out in collaboration with the US Geological Survey and the US Fish and Wildlife Service.
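
To make the idea of a myopic policy concrete, here is a minimal sketch of the classical, non-adaptive special case: greedy selection for a coverage objective, which is a submodular set function. The adaptive setting reviewed in the article additionally conditions each choice on observations made so far, which this sketch does not model; the function name and the toy sets are assumptions for illustration only.

```python
def greedy_max_coverage(sets, budget):
    """Pick up to `budget` named sets, each time taking the largest marginal gain.

    `sets` maps a name to a set of covered elements. Coverage has diminishing
    returns: adding a set never helps more after other sets have been chosen.
    """
    covered = set()
    chosen = []
    for _ in range(budget):
        candidates = [name for name in sets if name not in chosen]
        if not candidates:
            break
        # Myopic step: maximize the immediate gain in newly covered elements.
        best = max(candidates, key=lambda name: len(sets[name] - covered))
        if not sets[best] - covered:
            break  # no remaining set adds anything new
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered


if __name__ == "__main__":
    sites = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}}
    print(greedy_max_coverage(sites, budget=2))
```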


Nonparametric Hierarchical Clustering of Functional Data

arXiv.org Machine Learning

In this paper, we deal with the problem of curve clustering. We propose a nonparametric method which partitions the curves into clusters and discretizes the dimensions of the curve points into intervals. The cross-product of these partitions forms a data-grid which is obtained using a Bayesian model selection approach while making no assumptions regarding the curves. Finally, a post-processing technique, aiming at reducing the number of clusters in order to improve the interpretability of the clustering, is proposed. It consists of optimally merging the clusters step by step, which corresponds to an agglomerative hierarchical classification whose dissimilarity measure is the variation of the criterion. Interestingly, this measure is none other than the sum of the Kullback-Leibler divergences between cluster distributions before and after the merges. The practical interest of the approach for functional data exploratory analysis is presented and compared with an alternative approach on an artificial and a real-world data set.


Relational Logistic Regression

AAAI Conferences

Logistic regression is a commonly used representation for aggregators in Bayesian belief networks when a child has multiple parents. In this paper we consider extending logistic regression to relational models, where we want to model varying populations and interactions among parents. We first examine the representational problems caused by population variation. We show how these problems arise even in simple cases with a single parametrized parent, and propose a linear relational logistic regression which we show can represent arbitrary linear (in population size) decision thresholds, whereas the traditional logistic regression cannot. Then we examine representing interactions among the parents of a child node, and representing non-linear dependency on population size. We propose a multi-parent relational logistic regression which can represent interactions among parents and arbitrary polynomial decision thresholds. Finally, we show how other well-known aggregators can be represented using this relational logistic regression.
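
One way to picture a decision threshold that is linear in population size is sketched below. The parametrization (separate weights on the counts of true and false parents) and the 60% threshold are assumptions chosen for illustration, not necessarily the exact formulation in the paper. The count-only model shown for contrast has a fixed threshold on the number of true parents, so it cannot track a fraction as the population grows.

```python
import math


def sigmoid(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def counts_only(n_true, w0, w1):
    """Logistic aggregator that only sees the number of true parents."""
    return sigmoid(w0 + w1 * n_true)


def linear_rlr(n_true, n_false, w0, w_true, w_false):
    """Hypothetical linear relational form: weights on both true and false counts."""
    return sigmoid(w0 + w_true * n_true + w_false * n_false)


if __name__ == "__main__":
    # Assumed target: the child is likely true when more than 60% of parents are true.
    w0, w_true, w_false = 0.0, 0.4, -0.6
    for n_true, n_false in [(7, 3), (5, 5), (70, 30), (50, 50)]:
        print(n_true, n_false, linear_rlr(n_true, n_false, w0, w_true, w_false))
    # A count-only model tuned for a population of 10 (threshold near 6 true parents)
    # wrongly fires at 50 true out of 100.
    print(counts_only(50, w0=-6.0, w1=1.0))
```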


Infinite Structured Hidden Semi-Markov Models

arXiv.org Machine Learning

This paper reviews recent advances in Bayesian nonparametric techniques for constructing and performing inference in infinite hidden Markov models. We focus on variants of Bayesian nonparametric hidden Markov models that enhance a posteriori state-persistence in particular. This paper also introduces a new Bayesian nonparametric framework for generating left-to-right and other structured, explicit-duration infinite hidden Markov models that we call the infinite structured hidden semi-Markov model.