Bayesian Non-Homogeneous Markov Models via Polya-Gamma Data Augmentation with Applications to Rainfall Modeling

arXiv.org Machine Learning

Discrete-time hidden Markov models are a broadly useful class of latent-variable models with applications in areas such as speech recognition, bioinformatics, and climate data analysis. It is common in practice to introduce temporal non-homogeneity into such models by making the transition probabilities dependent on time-varying exogenous input variables via a multinomial logistic parametrization. We extend such models to introduce additional non-homogeneity into the emission distribution using a generalized linear model (GLM), with data augmentation for sampling-based inference. However, the presence of the logistic function in the state transition model significantly complicates parameter inference for the overall model, particularly in a Bayesian context. To address this we extend the recently proposed Polya-Gamma data augmentation approach to handle non-homogeneous hidden Markov models (NHMMs), allowing the development of an efficient Markov chain Monte Carlo (MCMC) sampling scheme. We apply our model and inference scheme to 30 years of daily rainfall in India, leading to a number of insights into rainfall-related phenomena in the region. Our proposed approach allows for fully Bayesian analysis of relatively complex NHMMs on a scale that was not possible with previous methods. Software implementing the methods described in the paper is available via the R package NHMM.
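
An illustrative sketch of the core conditional update, shown for the simpler binary-logistic case rather than the paper's full multinomial-logistic NHMM (which is implemented in the R package NHMM): augmenting each observation with omega_i ~ PG(1, x_i'beta) makes the conditional for beta Gaussian. The truncated sum-of-gammas PG sampler and the N(0, 10 I) prior below are illustrative assumptions, not the paper's implementation.

    # Minimal sketch of Polya-Gamma data augmentation for Bayesian logistic
    # regression; the PG draw uses a truncated sum-of-gammas approximation.
    import numpy as np

    rng = np.random.default_rng(0)

    def sample_pg(c, trunc=200):
        """Approximate draws from PG(1, c) via the sum-of-gammas representation."""
        c = np.atleast_1d(c)
        k = np.arange(1, trunc + 1)
        g = rng.gamma(shape=1.0, scale=1.0, size=(len(c), trunc))
        denom = (k - 0.5) ** 2 + (c[:, None] / (2 * np.pi)) ** 2
        return (g / denom).sum(axis=1) / (2 * np.pi ** 2)

    def gibbs_logistic(X, y, n_iter=1000, prior_var=10.0):
        """Gibbs sampler for beta | y under a N(0, prior_var * I) prior."""
        n, p = X.shape
        beta = np.zeros(p)
        B_inv = np.eye(p) / prior_var
        kappa = y - 0.5                    # the PG trick: kappa_i = y_i - 1/2
        draws = np.empty((n_iter, p))
        for t in range(n_iter):
            omega = sample_pg(X @ beta)    # augment: omega_i ~ PG(1, x_i' beta)
            V = np.linalg.inv(X.T @ (omega[:, None] * X) + B_inv)
            m = V @ (X.T @ kappa)
            beta = rng.multivariate_normal(m, V)   # conditionally Gaussian
            draws[t] = beta
        return draws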


Scan Order in Gibbs Sampling: Models in Which it Matters and Bounds on How Much

Neural Information Processing Systems

Gibbs sampling is a Markov chain Monte Carlo sampling technique that iteratively samples variables from their conditional distributions. There are two common scan orders for the variables: random scan and systematic scan. Due to the benefits of locality in hardware, systematic scan is commonly used, even though most statistical guarantees are only for random scan. While it has been conjectured that the mixing times of random scan and systematic scan do not differ by more than a logarithmic factor, we show by counterexample that this is not the case, and we prove that the mixing times do not differ by more than a polynomial factor under mild conditions. To prove these relative bounds, we introduce a method of augmenting the state space to study systematic scan using conductance.
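
A toy contrast of the two scan orders (illustrative, not from the paper) on a bivariate Gaussian with correlation rho, where each full conditional x_i | x_j is N(rho * x_j, 1 - rho^2). One random-scan step updates a single randomly chosen coordinate, while one systematic sweep updates both coordinates in a fixed order.

    import numpy as np

    rng = np.random.default_rng(1)
    rho = 0.9  # strong correlation makes the scan-order choice more visible

    def cond_draw(other):
        # full conditional of one coordinate given the other
        return rng.normal(rho * other, np.sqrt(1 - rho ** 2))

    def gibbs(n_iter, scan="systematic"):
        x = np.zeros(2)
        chain = np.empty((n_iter, 2))
        for t in range(n_iter):
            if scan == "systematic":
                order = (0, 1)                # fixed sweep: x0 then x1
            else:
                order = (rng.integers(2),)    # random scan: one random coordinate
            for i in order:
                x[i] = cond_draw(x[1 - i])
            chain[t] = x
        return chain

    systematic = gibbs(10_000)
    random_scan = gibbs(10_000, scan="random")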


Global Climate Model Tracking Using Geospatial Neighborhoods

AAAI Conferences

A key problem in climate science is how to combine the predictions of the multi-model ensemble of global climate models. Recent work in machine learning (Monteleoni et al. 2011) showed the promise of an algorithm for online learning with experts for this task. We extend the Tracking Climate Models (TCM) approach to (1) take into account climate model predictions at higher spatial resolutions and (2) model geospatial neighborhood influence between regions. Our algorithm enables neighborhood influence by modifying the transition dynamics of the hidden Markov model used by TCM, allowing the performance of spatial neighbors to influence the temporal switching probabilities for the best expert (climate model) at a given location. In experiments on historical data at a variety of spatial resolutions, our algorithm demonstrates improvements over TCM when tracking global temperature anomalies.
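
One way to sketch the neighborhood mechanism (an illustrative assumption, not the paper's exact update) is a Fixed-Share-style experts algorithm in which the share step, which governs switching between experts, is tilted toward climate models that currently carry weight in neighboring grid cells. The alpha and eta parameters and the neighbor_weights input below are hypothetical.

    import numpy as np

    def track_cell(losses, neighbor_weights, alpha=0.05, eta=1.0):
        """Track K climate-model experts at one grid cell.

        losses: (T, K) per-expert prediction losses at this cell.
        neighbor_weights: (T, K) expert weights averaged over spatial neighbors.
        """
        T, K = losses.shape
        w = np.full(K, 1.0 / K)
        preds = np.empty((T, K))
        for t in range(T):
            preds[t] = w                        # weights used for prediction
            w = w * np.exp(-eta * losses[t])    # multiplicative loss update
            w /= w.sum()
            # share step: with probability alpha, switch to expert j in
            # proportion to the weight j carries in neighboring cells
            share = neighbor_weights[t] / neighbor_weights[t].sum()
            w = (1 - alpha) * w + alpha * share
        return preds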


Authorial Idioms for Target Distributions in TTD-MDPs

AAAI Conferences

In designing Markov Decision Processes (MDPs), one must define the world, its dynamics, a set of actions, and a reward function. MDPs are often applied in situations where there is no clear choice of reward function, and in these cases significant care must be taken to construct a reward function that induces the desired behavior. In this paper, we consider an analogous design problem: crafting a target distribution in Targeted Trajectory Distribution MDPs (TTD-MDPs). TTD-MDPs produce probabilistic policies that minimize divergence from a target distribution over trajectories of an underlying MDP. They are an extension of MDPs that provides variety of experience during repeated executions. Here, we present a brief overview of TTD-MDPs along with approaches for constructing target distributions. We then present a novel authorial idiom for creating target distributions from prototype trajectories, and we evaluate these approaches on a drama manager for an interactive game.
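
To make the target-distribution machinery concrete: in the deterministic-transition case, trajectories of the underlying MDP form a tree, and the TTD-MDP policy at each internal node selects a child with probability proportional to the target mass of that child's subtree. The dict-based tree encoding below is an illustrative assumption, not code from the paper.

    def ttd_policy(tree, target, root):
        """tree: internal node -> list of children; target: leaf -> probability.

        Returns, for each internal node, the local action distribution that
        reproduces the target distribution over complete trajectories.
        """
        mass = {}

        def backup(n):
            # a node absent from `tree` is a leaf (a complete trajectory)
            mass[n] = target[n] if n not in tree else sum(backup(c) for c in tree[n])
            return mass[n]

        backup(root)
        return {n: {c: mass[c] / mass[n] for c in kids}
                for n, kids in tree.items()}

    # usage: a tiny drama-manager-like trajectory tree
    tree = {"start": ["a", "b"], "a": ["a1", "a2"]}
    target = {"a1": 0.25, "a2": 0.25, "b": 0.5}
    print(ttd_policy(tree, target, "start"))
    # {'start': {'a': 0.5, 'b': 0.5}, 'a': {'a1': 0.5, 'a2': 0.5}}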