 viterbi path


Regenerativity of Viterbi process for pairwise Markov models

arXiv.org Machine Learning

For hidden Markov models, one of the most popular estimates of the hidden chain is the Viterbi path -- the path maximising the posterior probability. We consider a more general setting, called the pairwise Markov model (PMM), where the joint process consisting of a finite-state hidden process and an observation process is assumed to be a Markov chain. It has recently been proven that under some conditions the Viterbi path of the PMM can almost surely be extended to infinity, thereby defining the infinite Viterbi decoding of the observation sequence, called the Viterbi process. This was done by constructing a block of observations, called a barrier, which ensures that the Viterbi path goes through a given state whenever this block occurs in the observation sequence. In this paper we prove that the joint process consisting of the Viterbi process and the PMM is regenerative. The proof involves a delicate construction of regeneration times which coincide with the occurrences of barriers. As one possible application of our theory, some results on the asymptotics of the Viterbi training algorithm are derived.
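For readers unfamiliar with the Viterbi path, the following is a minimal sketch of the standard dynamic-programming decoder for an ordinary finite HMM, not for the PMM setting or the barrier construction of the paper. The function name `viterbi` and the parameter layout (`init`, `trans`, `emit` as nested lists of strictly positive probabilities) are assumptions made for illustration.

```python
import math


def viterbi(obs, init, trans, emit):
    """Return the most probable hidden state path for a finite HMM.

    Assumed layout (hypothetical, for illustration only):
      init[i]     = P(first state = i)
      trans[i][j] = P(next state = j | current state = i)
      emit[j][o]  = P(observation = o | state = j)
    All probabilities are assumed strictly positive so that logs are finite.
    """
    n_states = len(init)
    # delta[j]: best log-probability of any path ending in state j so far
    delta = [math.log(init[j]) + math.log(emit[j][obs[0]]) for j in range(n_states)]
    back = []  # back[t][j]: best predecessor of state j at step t + 1
    for o in obs[1:]:
        prev = delta
        pointers, delta = [], []
        for j in range(n_states):
            best_i = max(range(n_states), key=lambda i: prev[i] + math.log(trans[i][j]))
            pointers.append(best_i)
            delta.append(prev[best_i] + math.log(trans[best_i][j]) + math.log(emit[j][o]))
        back.append(pointers)
    # Backtrack from the best final state
    state = max(range(n_states), key=lambda j: delta[j])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Toy two-state, three-symbol example (made-up numbers)
init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
path = viterbi([0, 1, 2], init, trans, emit)
```

Working in log-space avoids underflow on long observation sequences; the backtracking step is what makes the result a single jointly optimal path rather than a sequence of per-step best guesses.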


Autoregressive Asymmetric Linear Gaussian Hidden Markov Models

arXiv.org Machine Learning

In a real-life process evolving over time, the relationship between its relevant variables may change. Therefore, it is advantageous to have different inference models for each state of the process. Asymmetric hidden Markov models fulfil this dynamical requirement and provide a framework where the trend of the process can be expressed as a latent variable. In this paper, we modify these recent asymmetric hidden Markov models to have an asymmetric autoregressive component, allowing the model to choose the order of autoregression that maximizes its penalized likelihood for a given training set. Additionally, we show how inference, hidden state decoding and parameter learning must be adapted to fit the proposed model. Finally, we run experiments with synthetic and real data to show the capabilities of this new model.


Parkinson's Disease Digital Biomarker Discovery with Optimized Transitions and Inferred Markov Emissions

arXiv.org Machine Learning

We search for digital biomarkers for Parkinson's Disease by observing approximate repetitive patterns matching hypothesized step and stride periodic cycles. These observations were modeled as a cycle of hidden states with randomness allowing deviation from a canonical pattern of transitions and emissions, under the hypothesis that the averaged features of hidden states would serve to informatively characterize classes of patients/controls. We propose a Hidden Semi-Markov Model (HSMM), a latent-state model, emitting 3D-acceleration vectors. Transitions and emissions are inferred from data. We fit separate models per unique device and training label. Hidden Markov Models (HMM) force geometric distributions of the duration spent at each state before transition to a new state. Instead, our HSMM allows us to specify the distribution of state duration. This modified version is more effective because we are interested more in each state's duration than in the sequence of distinct states, allowing inclusion of these durations in the feature vector.
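The contrast between HMM and HSMM state durations can be made concrete with a small sketch. In an HMM with self-transition probability $a$, the time spent in a state has the geometric distribution $P(D = d) = a^{d-1}(1 - a)$, whose mode is always $d = 1$. The code below computes that distribution and, under the simplifying assumption of a strict cycle through the states (the paper allows deviations from the canonical cycle), samples a state sequence from explicitly specified duration distributions. The function names and toy numbers are hypothetical.

```python
import random


def geometric_duration_pmf(self_prob, max_d):
    """P(D = d) for d = 1..max_d implied by an HMM self-transition probability."""
    return [self_prob ** (d - 1) * (1.0 - self_prob) for d in range(1, max_d + 1)]


def sample_hsmm_cycle(duration_pmfs, n_steps, rng):
    """Sample a state sequence cycling through states 0, 1, ..., K-1.

    duration_pmfs[k][d - 1] is the (explicitly chosen) probability that a
    visit to state k lasts d steps. This strict cycle is an illustrative
    simplification, not the model fitted in the paper.
    """
    states, k = [], 0
    while len(states) < n_steps:
        pmf = duration_pmfs[k]
        d = rng.choices(range(1, len(pmf) + 1), weights=pmf)[0]
        states.extend([k] * d)
        k = (k + 1) % len(duration_pmfs)
    return states[:n_steps]


# HMM: durations are forced to be geometric, most likely duration is 1 step
hmm_pmf = geometric_duration_pmf(0.8, 5)

# HSMM: state 0 always lasts 3 steps, state 1 always lasts 2 steps
sequence = sample_hsmm_cycle([[0.0, 0.0, 1.0], [0.0, 1.0]], 10, random.Random(0))
```

Because the durations are drawn explicitly, they are available directly as features of each visit, which is the property the abstract relies on.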


A Bounded $p$-norm Approximation of Max-Convolution for Sub-Quadratic Bayesian Inference on Additive Factors

arXiv.org Machine Learning

Max-convolution is an important problem closely resembling standard convolution; as such, max-convolution occurs frequently across many fields. Here we extend the method with fastest known worst-case runtime, which can be applied to nonnegative vectors by numerically approximating the Chebyshev norm $\| \cdot \|_\infty$, and use this approach to derive two numerically stable methods based on the idea of computing $p$-norms via fast convolution: The first method proposed, with runtime in $O( k \log(k) \log(\log(k)) )$ (which is less than $18 k \log(k)$ for any vectors that can be practically realized), uses the $p$-norm as a direct approximation of the Chebyshev norm. The second approach proposed, with runtime in $O( k \log(k) )$ (although in practice both perform similarly), uses a novel null space projection method, which extracts information from a sequence of $p$-norms to estimate the maximum value in the vector (this is equivalent to querying a small number of moments from a distribution of bounded support in order to estimate the maximum). The $p$-norm approaches are compared to one another and are shown to compute an approximation of the Viterbi path in a hidden Markov model where the transition matrix is a Toeplitz matrix; the runtime of approximating the Viterbi path is thus reduced from $O( n k^2 )$ steps to $O( n k \log(k) )$ steps in practice, and is demonstrated by inferring the U.S. unemployment rate from the S&P 500 stock index.
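The basic idea of replacing the maximum by a $p$-norm can be sketched as follows: for nonnegative vectors, $\max_i x_i y_{k-i} \approx \big( \sum_i (x_i y_{k-i})^p \big)^{1/p}$, and the sum on the right is an ordinary convolution of $x^p$ and $y^p$, which an FFT computes in $O( k \log(k) )$. The code below uses a single fixed $p$ and a small hand-written radix-2 FFT so that it runs with the standard library alone; it does not reproduce the paper's numerically stable choice of $p$ or its null space projection method. All function names are hypothetical.

```python
import cmath


def _fft(a, sign):
    """Recursive radix-2 FFT; len(a) must be a power of two.

    sign = -1 gives the forward transform, sign = +1 the unnormalised inverse.
    """
    n = len(a)
    if n == 1:
        return list(a)
    even, odd = _fft(a[0::2], sign), _fft(a[1::2], sign)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out


def fast_convolve(x, y):
    """Linear convolution of two nonnegative real sequences via FFT."""
    m = len(x) + len(y) - 1
    size = 1
    while size < m:
        size *= 2
    fx = _fft(list(x) + [0.0] * (size - len(x)), -1)
    fy = _fft(list(y) + [0.0] * (size - len(y)), -1)
    prod = _fft([a * b for a, b in zip(fx, fy)], +1)
    # Divide by size to normalise the inverse; clip round-off below zero
    return [max(v.real / size, 0.0) for v in prod[:m]]


def max_convolve_exact(x, y):
    """Reference O(k^2) max-convolution: out[k] = max_i x[i] * y[k - i]."""
    out = [0.0] * (len(x) + len(y) - 1)
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            out[i + j] = max(out[i + j], xi * yj)
    return out


def max_convolve_pnorm(x, y, p=8):
    """Approximate max-convolution of nonnegative vectors with a fixed p-norm."""
    conv = fast_convolve([v ** p for v in x], [v ** p for v in y])
    return [c ** (1.0 / p) for c in conv]


x = [0.1, 0.9, 0.3]
y = [0.5, 0.2, 0.8]
exact = max_convolve_exact(x, y)
approx = max_convolve_pnorm(x, y, p=8)
```

The $p$-norm never underestimates the true maximum in exact arithmetic and overestimates it by at most a factor $m^{1/p}$ when $m$ terms contribute. Raising $p$ tightens that bound but pushes small entries toward floating-point round-off, which is the trade-off the stable methods in the abstract are designed to handle.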


A generalized risk approach to path inference based on hidden Markov models

arXiv.org Machine Learning

Motivated by the unceasing interest in hidden Markov models (HMMs), this paper re-examines hidden path inference in these models, using primarily a risk-based framework. While the most common maximum a posteriori (MAP), or Viterbi, path estimator and the minimum error, or Posterior Decoder (PD), have long been around, other path estimators, or decoders, have been either only hinted at or applied more recently and in dedicated applications generally unfamiliar to the statistical learning community. Over a decade ago, however, a family of algorithmically defined decoders aiming to hybridize the two standard ones was proposed (Brushe et al., 1998). The present paper gives a careful analysis of this hybridization approach, identifies several problems and issues with it and other previously proposed approaches, and proposes practical resolutions of those. Furthermore, simple modifications of the classical criteria for hidden path recognition are shown to lead to a new class of decoders. Dynamic programming algorithms to compute these decoders in the usual forward-backward manner are presented. A particularly interesting subclass of such estimators can be also viewed as hybrids of the MAP and PD estimators. Similar to previously proposed MAP-PD hybrids, the new class is parameterized by a small number of tunable parameters. Unlike their algorithmic predecessors, the new risk-based decoders are more clearly interpretable, and, most importantly, work "out of the box" in practice, which is demonstrated on some real bioinformatics tasks and data. Some further generalizations and applications are discussed in conclusion.
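To make the contrast with the Viterbi (MAP) path concrete, here is a minimal sketch of the other standard estimator mentioned above, the Posterior Decoder, which picks at each time step the state with the largest marginal posterior probability. It uses a normalised forward-backward recursion for an ordinary finite HMM; it is not one of the hybrid or risk-based decoders proposed in the paper. The function name and the parameter layout are assumptions made for illustration.

```python
def posterior_decode(obs, init, trans, emit):
    """Pointwise most probable states of a finite HMM (Posterior Decoder).

    Assumed layout (hypothetical, for illustration only):
      init[i]     = P(first state = i)
      trans[i][j] = P(next state = j | current state = i)
      emit[j][o]  = P(observation = o | state = j)
    """
    n, T = len(init), len(obs)

    def normalise(v):
        s = sum(v)
        return [x / s for x in v]

    # Forward pass: alpha[t][j] is proportional to P(obs[0..t], state_t = j)
    alpha = [normalise([init[j] * emit[j][obs[0]] for j in range(n)])]
    for o in obs[1:]:
        prev = alpha[-1]
        alpha.append(normalise(
            [emit[j][o] * sum(prev[i] * trans[i][j] for i in range(n)) for j in range(n)]
        ))

    # Backward pass: beta[t][i] is proportional to P(obs[t+1..] | state_t = i)
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = normalise(
            [sum(trans[i][j] * emit[j][obs[t + 1]] * beta[t + 1][j] for j in range(n))
             for i in range(n)]
        )

    # The marginal posterior at time t is proportional to alpha[t][j] * beta[t][j]
    path = []
    for t in range(T):
        post = [alpha[t][j] * beta[t][j] for j in range(n)]
        path.append(max(range(n), key=lambda j: post[j]))
    return path


# Toy two-state, three-symbol example (made-up numbers)
init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
pd_path = posterior_decode([0, 1, 2], init, trans, emit)
```

The Posterior Decoder minimises the expected number of pointwise errors, but because each position is chosen independently, the resulting sequence is not guaranteed to be a path of positive probability under the model. That gap between the two criteria is what motivates hybrid decoders.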