Goto

Collaborating Authors

 Markov Models


Estimating Convergence of Markov chains with L-Lag Couplings

Neural Information Processing Systems

Markov chain Monte Carlo (MCMC) methods generate samples that are asymptotically distributed from a target distribution of interest as the number of iterations goes to infinity. Various theoretical results provide upper bounds on the distance between the target and marginal distribution after a fixed number of iterations. These upper bounds are on a case by case basis and typically involve intractable quantities, which limits their use for practitioners. We introduce L-lag couplings to generate computable, non-asymptotic upper bound estimates for the total variation or the Wasserstein distance of general Markov chains. We apply L-lag couplings to the tasks of (i) determining MCMC burn-in, (ii) comparing different MCMC algorithms with the same target, and (iii) comparing exact and approximate MCMC.


Non-Stationary Markov Decision Processes, a Worst-Case Approach using Model-Based Reinforcement Learning

Neural Information Processing Systems

This work tackles the problem of robust zero-shot planning in non-stationary stochastic environments. We study Markov Decision Processes (MDPs) evolving over time and consider Model-Based Reinforcement Learning algorithms in this setting. We make two hypotheses: 1) the environment evolves continuously with a bounded evolution rate; 2) a current model is known at each decision epoch but not its evolution. Our contribution can be presented in four points. We introduce the notion of regular evolution by making an hypothesis of Lipschitz-Continuity on the transition and reward functions w.r.t.


Markov Random Fields for Collaborative Filtering

Neural Information Processing Systems

In this paper, we model the dependencies among the items that are recommended to a user in a collaborative-filtering problem via a Gaussian Markov Random Field (MRF). We build upon Besag's auto-normal parameterization and pseudo-likelihood, which not only enables computationally efficient learning, but also connects the areas of MRFs and sparse inverse covariance estimation with autoencoders and neighborhood models, two successful approaches in collaborative filtering. We propose a novel approximation for learning sparse MRFs, where the trade-off between recommendation-accuracy and training-time can be controlled. At only a small fraction of the training-time compared to various baselines, including deep nonlinear models, the proposed approach achieved competitive ranking-accuracy on all three well-known data-sets used in our experiments, and notably a 20% gain in accuracy on the data-set with the largest number of items. Papers published at the Neural Information Processing Systems Conference.


Pseudo-Extended Markov chain Monte Carlo

Neural Information Processing Systems

Sampling from posterior distributions using Markov chain Monte Carlo (MCMC) methods can require an exhaustive number of iterations, particularly when the posterior is multi-modal as the MCMC sampler can become trapped in a local mode for a large number of iterations. In this paper, we introduce the pseudo-extended MCMC method as a simple approach for improving the mixing of the MCMC sampler for multi-modal posterior distributions. On the extended space, the modes of the posterior are connected, which allows the MCMC sampler to easily move between well-separated posterior modes. We demonstrate that the pseudo-extended approach delivers improved MCMC sampling over the Hamiltonian Monte Carlo algorithm on multi-modal posteriors, including Boltzmann machines and models with sparsity-inducing priors. Papers published at the Neural Information Processing Systems Conference.


Differentially Private Markov Chain Monte Carlo

Neural Information Processing Systems

Recent developments in differentially private (DP) machine learning and DP Bayesian learning have enabled learning under strong privacy guarantees for the training data subjects. In this paper, we further extend the applicability of DP Bayesian learning by presenting the first general DP Markov chain Monte Carlo (MCMC) algorithm whose privacy-guarantees are not subject to unrealistic assumptions on Markov chain convergence and that is applicable to posterior inference in arbitrary models. Our algorithm is based on a decomposition of the Barker acceptance test that allows evaluating the Rรฉnyi DP privacy cost of the accept-reject choice. We further show how to improve the DP guarantee through data subsampling and approximate acceptance tests. Papers published at the Neural Information Processing Systems Conference.


Large Scale Markov Decision Processes with Changing Rewards

Neural Information Processing Systems

We consider Markov Decision Processes (MDPs) where the rewards are unknown and may change in an adversarial manner. We provide an algorithm that achieves a regret bound of $O( \sqrt{\tau (\ln S \ln A)T}\ln(T))$, where $S$ is the state space, $A$ is the action space, $\tau$ is the mixing time of the MDP, and $T$ is the number of periods. The algorithm's computational complexity is polynomial in $ S $ and $ A $. We then consider a setting often encountered in practice, where the state space of the MDP is too large to allow for exact solutions. By approximating the state-action occupancy measures with a linear architecture of dimension $d\ll S $, we propose a modified algorithm with a computational complexity polynomial in $d$ and independent of $ S $. We also prove a regret bound for this modified algorithm, which to the best of our knowledge, is the first $\tilde{O}(\sqrt{T})$ regret bound in the large-scale MDP setting with adversarially changing rewards.


Expressive power of tensor-network factorizations for probabilistic modeling

Neural Information Processing Systems

Tensor-network techniques have recently proven useful in machine learning, both as a tool for the formulation of new learning algorithms and for enhancing the mathematical understanding of existing methods. Inspired by these developments, and the natural correspondence between tensor networks and probabilistic graphical models, we provide a rigorous analysis of the expressive power of various tensor-network factorizations of discrete multivariate probability distributions. These factorizations include non-negative tensor-trains/MPS, which are in correspondence with hidden Markov models, and Born machines, which are naturally related to the probabilistic interpretation of quantum circuits. When used to model probability distributions, they exhibit tractable likelihoods and admit efficient learning algorithms. Interestingly, we prove that there exist probability distributions for which there are unbounded separations between the resource requirements of some of these tensor-network factorizations.


Multi-View Reinforcement Learning

Neural Information Processing Systems

This paper is concerned with multi-view reinforcement learning (MVRL), which allows for decision making when agents share common dynamics but adhere to different observation models. We define the MVRL framework by extending partially observable Markov decision processes (POMDPs) to support more than one observation model and propose two solution methods through observation augmentation and cross-view policy transfer. We empirically evaluate our method and demonstrate its effectiveness in a variety of environments. Specifically, we show reductions in sample complexities and computational time for acquiring policies that handle multi-view environments. Papers published at the Neural Information Processing Systems Conference.


A Survey on Deep Learning for Named Entity Recognition

arXiv.org Artificial Intelligence

Named entity recognition (NER) is the task to identify mentions of rigid designators from text belonging to predefined semantic types such as person, location, organization etc. NER always serves as the foundation for many natural language applications such as question answering, text summarization, and machine translation. Early NER systems got a huge success in achieving good performance with the cost of human engineering in designing domain-specific features and rules. In recent years, deep learning, empowered by continuous real-valued vector representations and semantic composition through nonlinear processing, has been employed in NER systems, yielding stat-of-the-art performance. In this paper, we provide a comprehensive review on existing deep learning techniques for NER. We first introduce NER resources, including tagged NER corpora and off-the-shelf NER tools. Then, we systematically categorize existing works based on a taxonomy along three axes: distributed representations for input, context encoder, and tag decoder. Next, we survey the most representative methods for recent applied techniques of deep learning in new NER problem settings and applications. Finally, we present readers with the challenges faced by NER systems and outline future directions in this area.


Optimizing Medical Treatment for Sepsis in Intensive Care: from Reinforcement Learning to Pre-Trial Evaluation

arXiv.org Artificial Intelligence

Our aim is to establish a framework where reinforcement learning (RL) of optimizing interventions retrospectively allows us a regulatory compliant pathway to prospective clinical testing of the learned policies in a clinical deployment. We focus on infections in intensive care units which are one of the major causes of death and difficult to treat because of the complex and opaque patient dynamics, and the clinically debated, highly-divergent set of intervention policies required by each individual patient, yet intensive care units are naturally data rich. In our work, we build on RL approaches in healthcare ("AI Clinicians"), and learn off-policy continuous dosing policy of pharmaceuticals for sepsis treatment using historical intensive care data under partially observable MDPs (POMDPs). POMPDs capture uncertainty in patient state better by taking in all historical information, yielding an efficient representation, which we investigate through ablations. We compensate for the lack of exploration in our retrospective data by evaluating each encountered state with a best-first tree search. We mitigate state distributional shift by optimizing our policy in the vicinity of the clinicians' compound policy. Crucially, we evaluate our model recommendations using not only conventional policy evaluations but a novel framework that incorporates human experts: a model-agnostic pre-clinical evaluation method to estimate the accuracy and uncertainty of clinician's decisions versus our system recommendations when confronted with the same individual patient history ("shadow mode").