lgssm
Reviews: A Disentangled Recognition and Nonlinear Dynamics Model for Unsupervised Learning
The paper presents a time-series model for high dimensional data by combining variational auto-encoder (VAE) with linear Gaussian state space model (LGSSM). The proposed model takes the latent repressentation from VAE as the output of LGSSM. The exact inference of linear Gaussian state space model via Kalman smoothing enables efficient and accurate variational inference for the overall model. To extend the temporal dynamics beyond linear dependency, the authors use a LSTM to parameterize the matrices in LGSSM. The performance of the proposed model is evaluated through bouncing ball and Pendulum experiments.
A Disentangled Recognition and Nonlinear Dynamics Model for Unsupervised Learning
Marco Fraccaro, Simon Kamronn, Ulrich Paquet, Ole Winther
This paper takes a step towards temporal reasoning in a dynamically changing video, not in the pixel space that constitutes its frames, but in a latent space that describes the non-linear dynamics of the objects in its world. We introduce the Kalman variational auto-encoder, a framework for unsupervised learning of sequential data that disentangles two latent representations: an object's representation, coming from a recognition model, and a latent state describing its dynamics. As a result, the evolution of the world can be imagined and missing data imputed, both without the need to generate high dimensional frames at each time step. The model is trained end-to-end on videos of a variety of simulated physical systems, and outperforms competing methods in generative and missing data imputation tasks.
Enhanced SMC$^2$: Leveraging Gradient Information from Differentiable Particle Filters Within Langevin Proposals
Rosato, Conor, Murphy, Joshua, Varsi, Alessandro, Horridge, Paul, Maskell, Simon
Sequential Monte Carlo Squared (SMC$^2$) is a Bayesian method which can infer the states and parameters of non-linear, non-Gaussian state-space models. The standard random-walk proposal in SMC$^2$ faces challenges, particularly with high-dimensional parameter spaces. This study outlines a novel approach by harnessing first-order gradients derived from a Common Random Numbers - Particle Filter (CRN-PF) using PyTorch. The resulting gradients can be leveraged within a Langevin proposal without accept/reject. Including Langevin dynamics within the proposal can result in a higher effective sample size and more accurate parameter estimates when compared with the random-walk. The resulting algorithm is parallelized on distributed memory using Message Passing Interface (MPI) and runs in $\mathcal{O}(\log_2N)$ time complexity. Utilizing 64 computational cores we obtain a 51x speed-up when compared to a single core. A GitHub link is given which provides access to the code.
Unified Inference for Variational Bayesian Linear Gaussian State-Space Models
Linear Gaussian State-Space Models are widely used and a Bayesian treatment of parameters is therefore of considerable interest. The approximate Variational Bayesian method applied to these models is an attractive approach, used successfully in applications ranging from acoustics to bioinformatics. The most challenging aspect of implementing the method is in performing inference on the hidden state sequence of the model. We show how to convert the inference problem so that standard Kalman Filtering/Smoothing recursions from the literature may be applied. This is in contrast to previously published approaches based on Belief Propagation.
Auxiliary MCMC and particle Gibbs samplers for parallelisable inference in latent dynamical systems
Corenflos, Adrien, Sรคrkkรค, Simo
We introduce two new classes of exact Markov chain Monte Carlo (MCMC) samplers for inference in latent dynamical models. The first one, which we coin auxiliary Kalman samplers, relies on finding a linear Gaussian state-space model approximation around the running trajectory corresponding to the state of the Markov chain. The second, that we name auxiliary particle Gibbs samplers corresponds to deriving good local proposals in an auxiliary Feynman--Kac model for use in particle Gibbs. Both samplers are controlled by augmenting the target distribution with auxiliary observations, resulting in an efficient Gibbs sampling routine. We discuss the relative statistical and computational performance of the samplers introduced, and show how to parallelise the auxiliary samplers along the time dimension. We illustrate the respective benefits and drawbacks of the resulting algorithms on classical examples from the particle filtering literature.
Time Series Clustering with an EM algorithm for Mixtures of Linear Gaussian State Space Models
Umatani, Ryohei, Imai, Takashi, Kawamoto, Kaoru, Kunimasa, Shutaro
In this paper, we consider the task of clustering a set of individual time series while modeling each cluster, that is, model-based time series clustering. The task requires a parametric model with sufficient flexibility to describe the dynamics in various time series. To address this problem, we propose a novel model-based time series clustering method with mixtures of linear Gaussian state space models, which have high flexibility. The proposed method uses a new expectation-maximization algorithm for the mixture model to estimate the model parameters, and determines the number of clusters using the Bayesian information criterion. Experiments on a simulated dataset demonstrate the effectiveness of the method in clustering, parameter estimation, and model selection. The method is applied to real datasets commonly used to evaluate time series clustering methods. Results showed that the proposed method produces clustering results that are as accurate or more accurate than those obtained using previous methods.
Stochastic Gradient MCMC for Nonlinear State Space Models
Aicher, Christopher, Putcha, Srshti, Nemeth, Christopher, Fearnhead, Paul, Fox, Emily B.
State space models (SSMs) provide a flexible framework for modeling complex time series via a latent stochastic process. Inference for nonlinear, non-Gaussian SSMs is often tackled with particle methods that do not scale well to long time series. The challenge is two-fold: not only do computations scale linearly with time, as in the linear case, but particle filters additionally suffer from increasing particle degeneracy with longer series. Stochastic gradient MCMC methods have been developed to scale inference for hidden Markov models (HMMs) and linear SSMs using buffered stochastic gradient estimates to account for temporal dependencies. We extend these stochastic gradient estimators to nonlinear SSMs using particle methods. We present error bounds that account for both buffering error and particle error in the case of nonlinear SSMs that are log-concave in the latent process. We evaluate our proposed particle buffered stochastic gradient using SGMCMC for inference on both long sequential synthetic and minute-resolution financial returns data, demonstrating the importance of this class of methods.
Stochastic Gradient MCMC for State Space Models
Aicher, Christopher, Ma, Yi-An, Foti, Nicholas J., Fox, Emily B.
State space models (SSMs) are a flexible approach to modeling complex time series. However, inference in SSMs is often computationally prohibitive for long time series. Stochastic gradient MCMC (SGMCMC) is a popular method for scalable Bayesian inference for large independent data. Unfortunately when applied to dependent data, such as in SSMs, SGMCMC's stochastic gradient estimates are biased as they break crucial temporal dependencies. To alleviate this, we propose stochastic gradient estimators that control this bias by performing additional computation in a `buffer' to reduce breaking dependencies. Furthermore, we derive error bounds for this bias and show a geometric decay under mild conditions. Using these estimators, we develop novel SGMCMC samplers for discrete, continuous and mixed-type SSMs. Our experiments on real and synthetic data demonstrate the effectiveness of our SGMCMC algorithms compared to batch MCMC, allowing us to scale inference to long time series with millions of time points.
A Disentangled Recognition and Nonlinear Dynamics Model for Unsupervised Learning
Fraccaro, Marco, Kamronn, Simon, Paquet, Ulrich, Winther, Ole
This paper takes a step towards temporal reasoning in a dynamically changing video, not in the pixel space that constitutes its frames, but in a latent space that describes the non-linear dynamics of the objects in its world. We introduce the Kalman variational auto-encoder, a framework for unsupervised learning of sequential data that disentangles two latent representations: an object's representation, coming from a recognition model, and a latent state describing its dynamics. As a result, the evolution of the world can be imagined and missing data imputed, both without the need to generate high dimensional frames at each time step. The model is trained end-to-end on videos of a variety of simulated physical systems, and outperforms competing methods in generative and missing data imputation tasks.