Sen, Deborshee
Targeted stochastic gradient Markov chain Monte Carlo for hidden Markov models with rare latent states
Ou, Rihui, Sen, Deborshee, Young, Alexander L, Dunson, David B
Markov chain Monte Carlo (MCMC) algorithms for hidden Markov models often rely on the forward-backward sampler. This makes them computationally slow as the length of the time series increases, motivating the recent development of sub-sampling-based approaches. These approximate the full posterior by using small random subsequences of the data at each MCMC iteration within stochastic gradient MCMC. In the presence of imbalanced data resulting from rare latent states, subsequences often exclude rare latent state data, leading to inaccurate inference and prediction/detection of rare events. We propose a targeted sub-sampling (TASS) approach that over-samples observations corresponding to rare latent states when calculating the stochastic gradient of parameters associated with them. TASS uses an initial clustering of the data to construct subsequence weights that reduce the variance in gradient estimation. This leads to improved sampling efficiency, in particular in settings where the rare latent states correspond to extreme observations. We demonstrate substantial gains in predictive and inferential accuracy on real and synthetic examples.
Bayesian neural networks and dimensionality reduction
Sen, Deborshee, Papamarkou, Theodore, Dunson, David
In conducting non-linear dimensionality reduction and feature learning, it is common to suppose that the data lie near a lower-dimensional manifold. A class of model-based approaches for such problems includes latent variables in an unknown non-linear regression function; this includes Gaussian process latent variable models and variational auto-encoders (VAEs) as special cases. VAEs are artificial neural networks (ANNs) that employ approximations to make computation tractable; however, current implementations lack adequate uncertainty quantification in estimating the parameters, predictive densities, and lower-dimensional subspace, and can be unstable and lack interpretability in practice. We attempt to solve these problems by deploying Markov chain Monte Carlo sampling algorithms (MCMC) for Bayesian inference in ANN models with latent variables. We address issues of identifiability by imposing constraints on the ANN parameters as well as by using anchor points. This is demonstrated on simulated and real data examples. We find that current MCMC sampling schemes face fundamental challenges in neural networks involving latent variables, motivating new research directions.