Bayesian Non-Homogeneous Markov Models via Polya-Gamma Data Augmentation with Applications to Rainfall Modeling

arXiv.org Machine Learning

Discrete-time hidden Markov models are a broadly useful class of latent-variable models with applications in areas such as speech recognition, bioinformatics, and climate data analysis. It is common in practice to introduce temporal non-homogeneity into such models by making the transition probabilities dependent on time-varying exogenous input variables via a multinomial logistic parametrization. We extend such models to introduce additional non-homogeneity into the emission distribution using a generalized linear model (GLM), with data augmentation for sampling-based inference. However, the presence of the logistic function in the state transition model significantly complicates parameter inference for the overall model, particularly in a Bayesian context. To address this we extend the recently proposed Polya-Gamma data augmentation approach to handle non-homogeneous hidden Markov models (NHMMs), allowing the development of an efficient Markov chain Monte Carlo (MCMC) sampling scheme. We apply our model and inference scheme to 30 years of daily rainfall in India, leading to a number of insights into rainfall-related phenomena in the region. Our proposed approach allows for fully Bayesian analysis of relatively complex NHMMs on a scale that was not possible with previous methods. Software implementing the methods described in the paper is available via the R package NHMM.
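To make the augmentation trick concrete, the sketch below shows one Gibbs sweep for Bayesian binary logistic regression, the conditionally conjugate building block that the paper extends to multinomial logistic transition probabilities in an NHMM. This is a minimal sketch under our own naming, not the NHMM package's API: it assumes the third-party Python package polyagamma for the PG draws, and pg_gibbs_step is an illustrative function name.

import numpy as np
from polyagamma import random_polyagamma  # third-party PG sampler (assumed dependency)

def pg_gibbs_step(X, y, beta, prior_prec, rng):
    """One Gibbs sweep: draw omega | beta, then beta | omega, y.

    With omega_i ~ PG(1, x_i' beta), the logistic likelihood becomes
    conditionally Gaussian in beta, so beta | omega, y is an exact draw
    from N(m, V) with V = (X' Omega X + B^{-1})^{-1} and m = V X' (y - 1/2).
    """
    psi = X @ beta                                       # linear predictor
    omega = random_polyagamma(z=psi, random_state=rng)   # omega_i ~ PG(1, psi_i)
    kappa = y - 0.5
    V = np.linalg.inv(X.T @ (omega[:, None] * X) + prior_prec)
    m = V @ (X.T @ kappa)
    return rng.multivariate_normal(m, V)

# Usage on synthetic data; every conditional is an exact draw, so the
# chain needs no Metropolis accept/reject step.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
beta_true = np.array([1.0, -2.0, 0.5])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ beta_true)))
beta = np.zeros(3)
for _ in range(2000):
    beta = pg_gibbs_step(X, y, beta, np.eye(3), rng)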


DiscLDA: Discriminative Learning for Dimensionality Reduction and Classification

Neural Information Processing Systems

Probabilistic topic models have become popular as methods for dimensionality reduction in collections of text documents or images. These models are usually treated as generative models and trained using maximum likelihood or Bayesian methods. In this paper, we discuss an alternative: a discriminative framework in which we assume that supervised side information is present, and in which we wish to take that side information into account in finding a reduced dimensionality representation. Specifically, we present DiscLDA, a discriminative variation on Latent Dirichlet Allocation (LDA) in which a class-dependent linear transformation is introduced on the topic mixture proportions. This parameter is estimated by maximizing the conditional likelihood. By using the transformed topic mixture proportions as a new representation of documents, we obtain a supervised dimensionality reduction algorithm that uncovers the latent structure in a document collection while preserving predictive power for the task of classification. We compare the predictive power of the latent structure of DiscLDA with unsupervised LDA on the 20 Newsgroups document classification task and show how our model can identify shared topics across classes as well as class-dependent topics.
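The class-dependent transformation is simple to state in code. The sketch below uses our own illustrative names (not the authors' released code) and assumes a stack T of shape (C, K, K0) holding one nonnegative transform per class: the shared topic proportions theta are mapped through T[y] before mixing the per-topic word distributions.

import numpy as np

def transform_proportions(theta, T, y):
    """Map shared proportions theta (K0,) through the class-y transform
    T[y] (K, K0). Renormalizing keeps the result a distribution even when
    T[y]'s columns are not exactly points on the simplex."""
    u = T[y] @ theta
    return u / u.sum()

def word_distribution(theta, T, y, topics):
    """P(word | theta, y): mix the (K, V) topic-word distributions with
    the transformed proportions."""
    return transform_proportions(theta, T, y) @ topics

Prediction can then compare the document's conditional likelihood under each candidate class y and pick the largest, matching the conditional-likelihood objective described in the abstract.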


Learning Finite-State Controllers for Partially Observable Environments

arXiv.org Artificial Intelligence

Reactive (memoryless) policies are sufficient in completely observable Markov decision processes (MDPs), but some kind of memory is usually necessary for optimal control of a partially observable MDP. Policies with finite memory can be represented as finite-state automata. In this paper, we extend Baird and Moore's VAPS algorithm to the problem of learning general finite-state automata. Because it performs stochastic gradient descent, this algorithm can be shown to converge to a locally optimal finite-state controller. We provide the details of the algorithm and then consider the question of under what conditions stochastic gradient descent will outperform exact gradient descent. We conclude with empirical results comparing the performance of stochastic and exact gradient descent, and showing the ability of our algorithm to extract the useful information contained in the sequence of past observations to compensate for the lack of observability at each time-step.
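To make the finite-memory idea concrete, the sketch below implements a stochastic finite-state controller of the kind such an algorithm optimizes: softmax-parameterized distributions over actions and over next internal nodes, with the observation driving only the node update. It is a minimal illustration under our own naming, not the VAPS update itself; the stochastic gradient step on these logits is omitted.

import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

class StochasticFSC:
    """Finite-state controller: the internal node is the only memory."""

    def __init__(self, n_nodes, n_obs, n_actions, rng):
        # Free parameters: logits for P(action | node) and
        # P(next node | node, observation).
        self.action_logits = rng.normal(size=(n_nodes, n_actions))
        self.trans_logits = rng.normal(size=(n_nodes, n_obs, n_nodes))
        self.node = 0
        self.rng = rng

    def act(self, obs):
        # Sample an action from the current node, then update the node
        # using the new observation; past observations influence behavior
        # only through this finite internal state.
        a = self.rng.choice(self.action_logits.shape[1],
                            p=softmax(self.action_logits[self.node]))
        self.node = self.rng.choice(self.trans_logits.shape[2],
                                    p=softmax(self.trans_logits[self.node, obs]))
        return a

A reactive policy is the special case n_nodes = 1, which is the abstract's point: additional internal nodes are what let the controller compensate for the lack of observability at each time-step.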


'Explainable Artificial Intelligence': Cracking open the black box of AI

#artificialintelligence

At a demonstration of Amazon Web Services' new artificial intelligence image recognition tool last week, the deep learning analysis calculated with near certainty that a photo of speaker Glenn Gore depicted a potted plant. "It is very clever, it can do some amazing things but it needs a lot of hand holding still. AI is almost like a toddler. They can do some pretty cool things, sometimes they can cause a fair bit of trouble," said AWS' chief architect in his day-two keynote at the company's summit in Sydney. Where the toddler analogy falls short, however, is that a parent can make a reasonable guess as to, say, what led to their child drawing all over the walls, and ask them why.

