 Markov Models


High-dimensional structure learning of binary pairwise Markov networks: A comparative numerical study

arXiv.org Machine Learning

Learning the undirected graph structure of a Markov network from data is a problem that has received a lot of attention during the last few decades. As a result of the general applicability of the model class, a myriad of methods have been developed in parallel in several research fields. Recently, as the size of the considered systems has increased, the focus of new methods has shifted towards the high-dimensional domain. In particular, the introduction of the pseudo-likelihood function has pushed the limits of score-based methods originally based on the likelihood. At the same time, an array of methods based on simple pairwise tests have been developed to meet the challenges set by the increasingly large data sets in computational biology. Apart from being applicable to high-dimensional problems, methods based on the pseudo-likelihood and pairwise tests are fundamentally very different. In this work, we perform an extensive numerical study comparing the different types of methods on data generated by binary pairwise Markov networks. For sampling large networks, we use a parallelizable Gibbs sampler based on sparse restricted Boltzmann machines. Our results show that pairwise methods can be more accurate than pseudo-likelihood methods in settings often encountered in high-dimensional structure learning.
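
To make the sampling step concrete, here is a minimal sketch of a plain single-site Gibbs sampler for a binary pairwise Markov network. It is not the parallelizable RBM-based sampler the abstract mentions. It assumes states in {0, 1} and a distribution proportional to exp(sum_i b_i x_i + sum_{i<j} W_ij x_i x_j); the names `gibbs_sample`, `bias`, and `weights` are illustrative and do not come from the paper.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def gibbs_sample(bias, weights, n_sweeps, seed=0):
    """Run single-site Gibbs sweeps and return the final state.

    bias    -- list of node biases b_i (assumed parameterization)
    weights -- symmetric matrix of pairwise couplings W_ij (diagonal ignored)
    """
    rng = random.Random(seed)
    n = len(bias)
    x = [rng.randint(0, 1) for _ in range(n)]  # random initial state
    for _ in range(n_sweeps):
        for i in range(n):
            # Local field on node i given the current state of all other nodes.
            field = bias[i] + sum(weights[i][j] * x[j] for j in range(n) if j != i)
            # Conditional probability P(x_i = 1 | rest) is sigmoid(field).
            x[i] = 1 if rng.random() < sigmoid(field) else 0
    return x
```

Each sweep resamples every node from its conditional distribution given its neighbours. A sparse network would store only the nonzero couplings of each node instead of a dense matrix.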


A Fully Bayesian Infinite Generative Model for Dynamic Texture Segmentation

arXiv.org Machine Learning

Generative dynamic texture models (GDTMs) are widely used for dynamic texture (DT) segmentation in video sequences. GDTMs represent DTs as a set of linear dynamical systems (LDSs). A major limitation of these models concerns the automatic selection of a proper number of DTs. Dirichlet process mixture (DPM) models, which have recently emerged as a cornerstone of non-parametric Bayesian statistics, are a promising candidate for resolving this issue. Motivated by this, we propose a novel non-parametric fully Bayesian approach for DT segmentation, formulated on the basis of a joint DPM and GDTM construction. This combination allows the algorithm to handle automatic segmentation properly. We derive the Variational Bayesian Expectation-Maximization (VBEM) inference for the proposed model. Moreover, in the E-step of inference, we apply the Rauch-Tung-Striebel smoother (RTSS) algorithm to the Variational Bayesian LDSs. Finally, we perform experiments on different video sequences. The experimental results indicate that the proposed algorithm noticeably outperforms previous methods in efficiency and accuracy.


Deep Learning meets Physics: Restricted Boltzmann Machines Part I

#artificialintelligence

In my opinion, RBMs have one of the simplest architectures of all neural networks. As can be seen in Fig. 1, the absence of an output layer is apparent. But as will be seen later, an output layer won't be needed, since predictions are made differently than in regular feedforward neural networks. Energy is a term that may not be associated with deep learning at first glance.


Improving Coordination in Multi-Agent Deep Reinforcement Learning through Memory-driven Communication

arXiv.org Machine Learning

Deep reinforcement learning algorithms have recently been used to train multiple interacting agents in a centralised manner whilst keeping their execution decentralised. When the agents can only acquire partial observations and are faced with a task requiring coordination and synchronisation skills, inter-agent communication plays an essential role. In this work, we propose a framework for multi-agent training using deep deterministic policy gradients that enables the concurrent, end-to-end learning of an explicit communication protocol through a memory device. During training, the agents learn to perform read and write operations enabling them to infer a shared representation of the world. We empirically demonstrate that concurrent learning of the communication device and individual policies can improve inter-agent coordination and performance, and illustrate how different communication patterns can emerge for different tasks.


Prototypical Metric Transfer Learning for Continuous Speech Keyword Spotting With Limited Training Data

arXiv.org Machine Learning

Continuous Speech Keyword Spotting (CSKS) is the problem of spotting keywords in recorded conversations when only a small number of instances of the keywords are available in the training data. Unlike the more common Keyword Spotting, where an algorithm needs to detect lone keywords or short phrases like "Alexa", "Cortana", "Hi Alexa!", "Whatsup Octavia?" etc. in speech, CSKS needs to filter out embedded words from a continuous flow of speech, i.e., spot "Anna" and "github" in "I know a developer named Anna who can look into this github issue." Apart from the issue of limited training data availability, CSKS is an extremely imbalanced classification problem. We address the limitations of simple keyword spotting baselines for both aforementioned challenges by using a novel combination of loss functions (Prototypical networks' loss and metric loss) and transfer learning. Our method improves F1 score by over 10%.


Deep Generative Markov State Models

arXiv.org Machine Learning

We propose a deep generative Markov State Model (DeepGenMSM) learning framework for inference of metastable dynamical systems and prediction of trajectories. After unsupervised training on time series data, the model contains (i) a probabilistic encoder that maps from high-dimensional configuration space to a small-sized vector indicating the membership to metastable (long-lived) states, (ii) a Markov chain that governs the transitions between metastable states and facilitates analysis of the long-time dynamics, and (iii) a generative part that samples the conditional distribution of configurations in the next time step. The model can be operated in a recursive fashion to generate trajectories to predict the system evolution from a defined starting state and propose new configurations. The DeepGenMSM is demonstrated to provide accurate estimates of the long-time kinetics and generate valid distributions for molecular dynamics (MD) benchmark systems. Remarkably, we show that DeepGenMSMs are able to make long time-steps in molecular configuration space and generate physically realistic structures in regions that were not seen in training data.


Life is Random, Time is Not: Markov Decision Processes with Window Objectives

arXiv.org Artificial Intelligence

The window mechanism was introduced by Chatterjee et al. [17] to strengthen classical game objectives with time bounds. It allows the synthesis of system controllers that exhibit acceptable behaviors within a configurable time frame, all along their infinite execution, in contrast to traditional objectives that only require correctness of behaviors in the limit. The window concept has proved its value in a variety of two-player zero-sum games, thanks to the ability to reason about such time bounds in system specifications, but also to the increased tractability that it usually yields. In this work, we extend the window framework to stochastic environments by considering the fundamental threshold probability problem in Markov decision processes for window objectives. That is, given such an objective, we want to synthesize strategies that guarantee satisfying runs with a given probability. We solve this problem for the usual variants of window objectives, where either the time frame is set as a parameter, or we ask if such a time frame exists. We develop a generic approach for window-based objectives and instantiate it for the classical mean-payoff and parity objectives, already considered in games. Our work paves the way to a wide use of the window mechanism in stochastic models.


Learning Undirected Posteriors by Backpropagation through MCMC Updates

arXiv.org Machine Learning

The representation of the posterior is a critical aspect of effective variational autoencoders (VAEs). Poor choices for the posterior have a detrimental impact on the generative performance of VAEs due to the mismatch with the true posterior. We extend the class of posterior models that may be learned by using undirected graphical models. We develop an efficient method to train undirected posteriors by showing that the gradient of the training objective with respect to the parameters of the undirected posterior can be computed by backpropagation through Markov chain Monte Carlo updates. We apply these gradient estimators for training discrete VAEs with Boltzmann machine posteriors and demonstrate that undirected models outperform previous results obtained using directed graphical models as posteriors.


Deep Learning for Human Affect Recognition: Insights and New Developments

arXiv.org Machine Learning

Automatic human affect recognition is a key step towards more natural human-computer interaction. Recent trends include recognition in the wild using a fusion of audiovisual and physiological sensors, a challenging setting for conventional machine learning algorithms. Since 2010, novel deep learning algorithms have been applied increasingly in this field. In this paper, we review the literature on human affect recognition between 2010 and 2017, with a special focus on approaches using deep neural networks. By classifying a total of 950 studies according to their usage of shallow or deep architectures, we are able to show a trend towards deep learning. Reviewing a subset of 233 studies that employ deep neural networks, we comprehensively quantify their applications in this field. We find that deep learning is used for learning of (i) spatial feature representations, (ii) temporal feature representations, and (iii) joint feature representations for multimodal sensor data. Exemplary state-of-the-art architectures illustrate the progress. Our findings show the role deep architectures will play in human affect recognition, and can serve as a reference point for researchers working on related applications.


Generating Haiku with Deep Learning – Towards Data Science

#artificialintelligence

I've done previous work on haiku generation. This generator uses Markov chains trained on a corpus of non-haiku poetry, generates haiku one word at a time, and ensures the 5-7-5 structure by backspacing when all the possible next words would violate the 5–7–5 structure. This isn't unlike what I do when I'm writing a haiku. I try things, count out the syllables, find they don't work and go back. It feels more like brute force than something that actually understands what it means to write a haiku.
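
The backspacing idea described above can be sketched roughly as follows. This is not the author's generator. It assumes a tiny made-up corpus and a hand-written syllable table (`CORPUS` and `SYLLABLES` are hypothetical; a real generator would take counts from a pronunciation dictionary), and it tries each distinct next word in random order rather than weighting by frequency.

```python
import random

# Hypothetical toy corpus and syllable counts, for illustration only.
CORPUS = ("the quiet pond waits the old frog jumps into "
          "the quiet water again the pond").split()
SYLLABLES = {"the": 1, "quiet": 2, "pond": 1, "waits": 1, "old": 1,
             "frog": 1, "jumps": 1, "into": 2, "water": 2, "again": 2}


def build_chain(words):
    """Map each word to the list of words that follow it in the corpus."""
    chain = {}
    for a, b in zip(words, words[1:]):
        chain.setdefault(a, []).append(b)
    return chain


def generate_haiku(chain, syllables, pattern=(5, 7, 5), seed=0):
    """Return a list of lines (lists of words) matching `pattern`, or None."""
    rng = random.Random(seed)

    def extend(prev, line_idx, remaining):
        if remaining == 0:  # the current line is exactly full
            if line_idx == len(pattern) - 1:
                return [[]]
            rest = extend(prev, line_idx + 1, pattern[line_idx + 1])
            return None if rest is None else [[]] + rest
        # Distinct successors of the previous word (any word at the start).
        options = sorted(chain) if prev is None else sorted(set(chain.get(prev, ())))
        rng.shuffle(options)
        for word in options:
            n = syllables.get(word, 0)
            if n == 0 or n > remaining:
                continue  # unknown word, or it would overflow the line
            rest = extend(word, line_idx, remaining - n)
            if rest is not None:
                rest[0].insert(0, word)
                return rest
        return None  # dead end: "backspace" to the previous word

    return extend(None, 0, pattern[0])


if __name__ == "__main__":
    haiku = generate_haiku(build_chain(CORPUS), SYLLABLES, seed=1)
    print("\n".join(" ".join(line) for line in haiku))
```

Returning `None` from a dead end plays the role of backspacing: the caller discards its last word and tries the next candidate. The search is exhaustive, so it only fails when no word sequence in the chain fits the pattern.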