Goto

Collaborating Authors

 markov model


Relational neurosymbolic Markov models

AIHub

Our most powerful artificial agents cannot be told exactly what to do, especially in complex planning environments. They almost exclusively rely on neural networks to perform their tasks, but neural networks cannot easily be told to obey certain rules or adhere to existing background knowledge. While such uncontrolled behaviour might be nothing more than a simple annoyance next time you ask an LLM to generate a schedule for reaching a deadline in two days and it starts to hallucinate that days have 48 hours instead of 24, it can be much more impactful when that same LLM is controlling an agent responsible for navigating a warehouse filled with TNT and it decides to go just a little too close to the storage compartments. Luckily, controlling neural networks has gained a lot of attention over the last years through the development of . Neurosymbolic AI, or NeSy for short, aims to combine the learning abilities of neural networks with the guarantees that symbolic methods based on automated mathematical reasoning offer.







Fast Gibbs Sampling on Bayesian Hidden Markov Model with Missing Observations

Li, Dongrong, Yu, Tianwei, Fan, Xiaodan

arXiv.org Machine Learning

The Hidden Markov Model (HMM) is a widely-used statistical model for handling sequential data. However, the presence of missing observations in real-world datasets often complicates the application of the model. The EM algorithm and Gibbs samplers can be used to estimate the model, yet suffering from various problems including non-convexity, high computational complexity and slow mixing. In this paper, we propose a collapsed Gibbs sampler that efficiently samples from HMMs' posterior by integrating out both the missing observations and the corresponding latent states. The proposed sampler is fast due to its three advantages. First, it achieves an estimation accuracy that is comparable to existing methods. Second, it can produce a larger Effective Sample Size (ESS) per iteration, which can be justified theoretically and numerically. Third, when the number of missing entries is large, the sampler has a significant smaller computational complexity per iteration compared to other methods, thus is faster computationally. In summary, the proposed sampling algorithm is fast both computationally and theoretically and is particularly advantageous when there are a lot of missing entries. Finally, empirical evaluations based on numerical simulations and real data analysis demonstrate that the proposed algorithm consistently outperforms existing algorithms in terms of time complexity and sampling efficiency (measured in ESS).


FlowHMM: Flow-based continuous hidden Markov models

Neural Information Processing Systems

Continuous hidden Markov models (HMMs) assume that observations are generated from a mixture of Gaussian densities, limiting their ability to model more complex distributions. In this work, we address this shortcoming and propose novel continuous HMM models, dubbed FlowHMMs, that enable learning general continuous observation densities without constraining them to follow a Gaussian distribution or their mixtures. To that end, we leverage deep flow-based architectures that model complex, non-Gaussian functions and propose two variants of training a FlowHMM model. The first one, based on gradient-based technique, can be applied directly to continuous multidimensional data, yet its application to larger data sequences remains computationally expensive. Therefore, we also present a second approach to training our FlowHMM that relies on the co-occurrence matrix of discretized observations and considers the joint distribution of pairs of co-observed values, hence rendering the training time independent of the training sequence length. As a result, we obtain a model that can be flexibly adapted to the characteristics and dimensionality of the data. We perform a variety of experiments in which we compare both training strategies with a baseline of Gaussian mixture models. We show, that in terms of quality of the recovered probability distribution, accuracy of prediction of hidden states, and likelihood of unseen data, our approach outperforms the standard Gaussian methods.


A comparison between initialization strategies for the infinite hidden Markov model

Cortese, Federico P., Rossini, Luca

arXiv.org Machine Learning

Infinite hidden Markov models provide a flexible framework for modelling time series with structural changes and complex dynamics, without requiring the number of latent states to be specified in advance. This flexibility is achieved through the hierarchical Dirichlet process prior, while efficient Bayesian inference is enabled by the beam sampler, which combines dynamic programming with slice sampling to truncate the infinite state space adaptively. Despite extensive methodological developments, the role of initialization in this framework has received limited attention. This study addresses this gap by systematically evaluating initialization strategies commonly used for finite hidden Markov models and assessing their suitability in the infinite setting. Results from both simulated and real datasets show that distance-based clustering initializations consistently outperform model-based and uniform alternatives, the latter being the most widely adopted in the existing literature.


On the Stochastic Stability of Deep Markov Models

Neural Information Processing Systems

This section proposes additional regularization methods for learning stable deep Markov models. The most direct approach is to include the stability conditions as extra penalties in the DMM loss function.