Undirected Networks
Learning Hidden Quantum Markov Models
Srinivasan, Siddarth, Gordon, Geoff, Boots, Byron
Hidden Quantum Markov Models (HQMMs) can be thought of as quantum probabilistic graphical models that can model sequential data. We extend previous work on HQMMs with three contributions: (1) we show how classical hidden Markov models (HMMs) can be simulated on a quantum circuit, (2) we reformulate HQMMs by relaxing the constraints for modeling HMMs on quantum circuits, and (3) we present a learning algorithm to estimate the parameters of an HQMM from data. While our algorithm requires further optimization to handle larger datasets, we are able to evaluate our algorithm using several synthetic datasets. We show that on HQMM generated data, our algorithm learns HQMMs with the same number of hidden states and predictive accuracy as the true HQMMs, while HMMs learned with the Baum-Welch algorithm require more states to match the predictive accuracy.
Mastering Machine Learning Algorithms PACKT Books
Machine learning is a subset of AI which aims at making modern-day computers smarter, and intelligent. The real power of machine learning resides in its algorithms which make even the most difficult things possible for machines to handle. However, with the advancement in the technology and requirement of data, our machines will have to be smarter than they are today to meet the overwhelming needs, and knowing how different algorithms work to make ML effective is the need of the hour. Mastering Machine Learning Algorithms is your tool, your guide to get to grips quickly with the most widely used algorithms. You start with learning various ML models along with an introduction to semi-supervised learning.
Mastering Machine Learning with scikit-learn PACKT Books
This book examines machine learning models including logistic regression, decision trees, and support vector machines, and applies them to common problems such as categorizing documents and classifying images. It begins with the fundamentals of machine learning, introducing you to the supervised-unsupervised spectrum, the uses of training and test data, and evaluating models. You will learn how to use generalized linear models in regression problems, as well as solve problems with text and categorical features. You will be acquainted with the use of logistic regression, regularization, and the various loss functions that are used by generalized linear models. The book will also walk you through an example project that prompts you to label the most uncertain training examples.
Linear-Time Algorithm in Bayesian Image Denoising based on Gaussian Markov Random Field
Yasuda, Muneki, Watanabe, Junpei, Kataoka, Shun, Tanaka, kazuyuki
Bayesian image processing [1] based on a probabilistic graphical model has a long and rich history [2]. In Bayesian image processing, one constructs a posterior distribution and then infers restored images based on the posterior distribution. The posterior distribution is derived from a prior distribution that captures the statistical properties of the images. One of the major challenges of Bayesian image processing is the construction of an effective prior for the images. For this purpose, a Gaussian Markov random field (GMRF) model (or Gaussian graphical model) is a possible choice.
Getting Started with Particle Metropolis-Hastings for Inference in Nonlinear Dynamical Models
Dahlin, Johan, Schรถn, Thomas B.
This tutorial provides a gentle introduction to the particle Metropolis-Hastings (PMH) algorithm for parameter inference in nonlinear state-space models together with a software implementation in the statistical programming language R. We employ a step-by-step approach to develop an implementation of the PMH algorithm (and the particle filter within) together with the reader. This final implementation is also available as the package pmhtutorial in the CRAN repository. Throughout the tutorial, we provide some intuition as to how the algorithm operates and discuss some solutions to problems that might occur in practice. To illustrate the use of PMH, we consider parameter inference in a linear Gaussian state-space model with synthetic data and a nonlinear stochastic volatility model with real-world data.
Estimate exponential memory decay in Hidden Markov Model and its applications
Ye, Felix X. -F., Ma, Yi-an, Qian, Hong
Inference in hidden Markov model has been challenging in terms of scalability due to dependencies in the observation data. In this paper, we utilize the inherent memory decay in hidden Markov models, such that the forward and backward probabilities can be carried out with subsequences, enabling efficient inference over long sequences of observations. We formulate this forward filtering process in the setting of the random dynamical system and there exist Lyapunov exponents in the i.i.d random matrices production. And the rate of the memory decay is known as $\lambda_2-\lambda_1$, the gap of the top two Lyapunov exponents almost surely. An efficient and accurate algorithm is proposed to numerically estimate the gap after the soft-max parametrization. The length of subsequences $B$ given the controlled error $\epsilon$ is $B=\log(\epsilon)/(\lambda_2-\lambda_1)$. We theoretically prove the validity of the algorithm and demonstrate the effectiveness with numerical examples. The method developed here can be applied to widely used algorithms, such as mini-batch stochastic gradient method. Moreover, the continuity of Lyapunov spectrum ensures the estimated $B$ could be reused for the nearby parameter during the inference.
Beyond similarity assessment: Selecting the optimal model for sequence alignment via the Factorized Asymptotic Bayesian algorithm
Takeda, Taikai, Hamada, Michiaki
Pair Hidden Markov Models (PHMMs) are probabilistic models used for pairwise sequence alignment, a quintessential problem in bioinformatics. PHMMs include three types of hidden states: match, insertion and deletion. Most previous studies have used one or two hidden states for each PHMM state type. However, few studies have examined the number of states suitable for representing sequence data or improving alignment accuracy.We developed a novel method to select superior models (including the number of hidden states) for PHMM. Our method selects models with the highest posterior probability using Factorized Information Criteria (FIC), which is widely utilised in model selection for probabilistic models with hidden variables. Our simulations indicated this method has excellent model selection capabilities with slightly improved alignment accuracy. We applied our method to DNA datasets from 5 and 28 species, ultimately selecting more complex models than those used in previous studies.
On better training the infinite restricted Boltzmann machines
Peng, Xuan, Gao, Xunzhang, Li, Xiang
The infinite restricted Boltzmann machine (iRBM) is an extension of the classic RBM. It enjoys a good property of automatically deciding the size of the hidden layer according to specific training data. With sufficient training, the iRBM can achieve a competitive performance with that of the classic RBM. However, the convergence of learning the iRBM is slow, due to the fact that the iRBM is sensitive to the ordering of its hidden units, the learned filters change slowly from the left-most hidden unit to right. To break this dependency between neighboring hidden units and speed up the convergence of training, a novel training strategy is proposed. The key idea of the proposed training strategy is randomly regrouping the hidden units before each gradient descent step. Potentially, a mixing of infinite many iRBMs with different permutations of the hidden units can be achieved by this learning method, which has a similar effect of preventing the model from over-fitting as the dropout. The original iRBM is also modified to be capable of carrying out discriminative training. To evaluate the impact of our method on convergence speed of learning and the model's generalization ability, several experiments have been performed on the binarized MNIST and CalTech101 Silhouettes datasets. Experimental results indicate that the proposed training strategy can greatly accelerate learning and enhance generalization ability of iRBMs.
Sparse Markov Decision Processes with Causal Sparse Tsallis Entropy Regularization for Reinforcement Learning
Lee, Kyungjae, Choi, Sungjoon, Oh, Songhwai
Arkov decision processes (MDPs) have been widely used as a mathematical framework to solve stochastic sequential decision problems, such as autonomous driving [1], path planning [2], and quadrotor control [3]. In general, the goal of an MDP is to find the optimal policy function which maximizes the expected return. The expected return is a performance measure of a policy function and it is often defined as the expected sum of discounted rewards. An MDP is often used to formulate reinforcement learning (RL) [4], which aims to find the optimal policy without the explicit specification of stochasticity of an environment, and inverse reinforcement learning (IRL) [5], whose goal is to search the proper reward function that can explain the behavior of an expert who follows the underlying optimal policy. While the optimal solution of an MDP is a deterministic policy, it is not desirable to apply an MDP to the problems with multiple optimal actions. In perspective of RL, the knowledge of multiple optimal actions makes it possible to cope with unexpected situations. For example, suppose that an autonomous vehicle has multiple optimal routes to reach a given goal. If a traffic accident occurs at the currently selected optimal route, it is possible to avoid the accident by choosing another safe optimal route without additional computation of a new optimal route.
A Dynamic Edge Exchangeable Model for Sparse Temporal Networks
We propose a dynamic edge exchangeable network model that can capture sparse connections observed in real temporal networks, in contrast to existing models which are dense. The model achieved superior link prediction accuracy on multiple data sets when compared to a dynamic variant of the blockmodel, and is able to extract interpretable time-varying community structures from the data. In addition to sparsity, the model accounts for the effect of social influence on vertices' future behaviours. Compared to the dynamic blockmodels, our model has a smaller latent space. The compact latent space requires a smaller number of parameters to be estimated in variational inference and results in a computationally friendly inference algorithm.