Directed Networks
On the Design of LQR Kernels for Efficient Controller Learning
Marco, Alonso, Hennig, Philipp, Schaal, Stefan, Trimpe, Sebastian
A core problem of learning control is to determine optimal feedback controllers for (partially) unknown nonlinear systems from experimental data. Reinforcement learning (RL) [1], [2] is a promising framework for this, yet often requires performing many experiments on the physical system to even find suitable controllers, which limits the applicability of such techniques. Therefore, a lot of research effort has been invested into data efficiency of RL aiming at learning controllers from as few experiments as possible. Recently, Bayesian optimization (BO) has been proposed for RL as a promising approach in this direction. BO employs a probabilistic description of the latent objective function (typically a Gaussian process (GP)), which allows for selecting next control experiments in a principled manner, e.g., to maximize information gain [3] or perform safe exploration [4]. While BO provides a promising framework for learning controllers in fairly general settings, the full power of Bayesian learning is often not exploited. A key advantage of Bayesian methods is that they allow for combining prior problem knowledge with learning from data in a principled manner. In case of GP models, this concerns specifically the choice of the kernel, which captures the covariance between function values at different inputs and is thus the core component to model prior knowledge about the function shape. By choosing standard kernels, however, naive BO approaches do often not exploit this opportunity to improve learning performance.
Anthropic decision theory
This paper sets out to resolve how agents ought to act in the Sleeping Beauty problem and various related anthropic (self-locating belief) problems, not through the calculation of anthropic probabilities, but through finding the correct decision to make. It creates an anthropic decision theory (ADT) that decides these problems from a small set of principles. By doing so, it demonstrates that the attitude of agents with regards to each other (selfish or altruistic) changes the decisions they reach, and that it is very important to take this into account. To illustrate ADT, it is then applied to two major anthropic problems and paradoxes, the Presumptuous Philosopher and Doomsday problems, thus resolving some issues about the probability of human extinction.
Scalable Estimation of Dirichlet Process Mixture Models on Distributed Data
We consider the estimation of Dirichlet Process Mixture Models (DPMMs) in distributed environments, where data are distributed across multiple computing nodes. A key advantage of Bayesian nonparametric models such as DPMMs is that they allow new components to be introduced on the fly as needed. This, however, posts an important challenge to distributed estimation -- how to handle new components efficiently and consistently. To tackle this problem, we propose a new estimation method, which allows new components to be created locally in individual computing nodes. Components corresponding to the same cluster will be identified and merged via a probabilistic consolidation scheme. In this way, we can maintain the consistency of estimation with very low communication cost. Experiments on large real-world data sets show that the proposed method can achieve high scalability in distributed and asynchronous environments without compromising the mixing performance.
Online Maximum Likelihood Estimation of the Parameters of Partially Observed Diffusion Processes
Surace, Simone Carlo, Pfister, Jean-Pascal
We revisit the problem of estimating the parameters of a partially observed diffusion process, consisting of a hidden state process and an observed process, with a continuous time parameter. The estimation is to be done online, i.e. the parameter estimate should be updated recursively based on the observation filtration. Here, we use an old but under-exploited representation of the incomplete-data log-likelihood function in terms of the filter of the hidden state from the observations. By performing a stochastic gradient ascent, we obtain a fully recursive algorithm for the time evolution of the parameter estimate. We prove the convergence of the algorithm under suitable conditions regarding the ergodicity of the process consisting of state, filter, and tangent filter. Additionally, our parameter estimation is shown numerically to have the potential of improving suboptimal filters, and can be applied even when the system is not identifiable due to parameter redundancies. Online parameter estimation is a challenging problem that is ubiquitous in fields such as robotics, neuroscience, or finance in order to design adaptive filters and optimal controllers for unknown or changing systems.
Model-Powered Conditional Independence Test
Sen, Rajat, Suresh, Ananda Theertha, Shanmugam, Karthikeyan, Dimakis, Alexandros G., Shakkottai, Sanjay
We consider the problem of non-parametric Conditional Independence testing (CI testing) for continuous random variables. Given i.i.d samples from the joint distribution $f(x,y,z)$ of continuous random vectors $X,Y$ and $Z,$ we determine whether $X \perp Y | Z$. We approach this by converting the conditional independence test into a classification problem. This allows us to harness very powerful classifiers like gradient-boosted trees and deep neural networks. These models can handle complex probability distributions and allow us to perform significantly better compared to the prior state of the art, for high-dimensional CI testing. The main technical challenge in the classification problem is the need for samples from the conditional product distribution $f^{CI}(x,y,z) = f(x|z)f(y|z)f(z)$ -- the joint distribution if and only if $X \perp Y | Z.$ -- when given access only to i.i.d. samples from the true joint distribution $f(x,y,z)$. To tackle this problem we propose a novel nearest neighbor bootstrap procedure and theoretically show that our generated samples are indeed close to $f^{CI}$ in terms of total variational distance. We then develop theoretical results regarding the generalization bounds for classification for our problem, which translate into error bounds for CI testing. We provide a novel analysis of Rademacher type classification bounds in the presence of non-i.i.d near-independent samples. We empirically validate the performance of our algorithm on simulated and real datasets and show performance gains over previous methods.
ZhuSuan: A Library for Bayesian Deep Learning
Shi, Jiaxin, Chen, Jianfei, Zhu, Jun, Sun, Shengyang, Luo, Yucen, Gu, Yihong, Zhou, Yuhao
In this paper we introduce ZhuSuan, a python probabilistic programming library for Bayesian deep learning, which conjoins the complimentary advantages of Bayesian methods and deep learning. ZhuSuan is built upon Tensorflow. Unlike existing deep learning libraries, which are mainly designed for deterministic neural networks and supervised tasks, ZhuSuan is featured for its deep root into Bayesian inference, thus supporting various kinds of probabilistic models, including both the traditional hierarchical Bayesian models and recent deep generative models. We use running examples to illustrate the probabilistic programming on ZhuSuan, including Bayesian logistic regression, variational auto-encoders, deep sigmoid belief networks and Bayesian recurrent neural networks.
A Probabilistic Framework for Nonlinearities in Stochastic Neural Networks
Su, Qinliang, Liao, Xuejun, Carin, Lawrence
We present a probabilistic framework for nonlinearities, based on doubly truncated Gaussian distributions. By setting the truncation points appropriately, we are able to generate various types of nonlinearities within a unified framework, including sigmoid, tanh and ReLU, the most commonly used nonlinearities in neural networks. The framework readily integrates into existing stochastic neural networks (with hidden units characterized as random variables), allowing one for the first time to learn the nonlinearities alongside model weights in these networks. Extensive experiments demonstrate the performance improvements brought about by the proposed framework when integrated with the restricted Boltzmann machine (RBM), temporal RBM and the truncated Gaussian graphical model (TGGM).
Learning Methods for Dynamic Topic Modeling in Automated Behaviour Analysis
Isupova, Olga, Kuzin, Danil, Mihaylova, Lyudmila
Semi-supervised and unsupervised systems provide operators with invaluable support and can tremendously reduce the operators load. In the light of the necessity to process large volumes of video data and provide autonomous decisions, this work proposes new learning algorithms for activity analysis in video. The activities and behaviours are described by a dynamic topic model. Two novel learning algorithms based on the expectation maximisation approach and variational Bayes inference are proposed. Theoretical derivations of the posterior of model parameters are given. The designed learning algorithms are compared with the Gibbs sampling inference scheme introduced earlier in the literature. A detailed comparison of the learning algorithms is presented on real video data. We also propose an anomaly localisation procedure, elegantly embedded in the topic modeling framework. The proposed framework can be applied to a number of areas, including transportation systems, security and surveillance.
Variational Gaussian Approximation for Poisson Data
Arridge, Simon, Ito, Kazufumi, Jin, Bangti, Zhang, Chen
The Poisson model is frequently employed to describe count data, but in a Bayesian context it leads to an analytically intractable posterior probability distribution. In this work, we analyze a variational Gaussian approximation to the posterior distribution arising from the Poisson model with a Gaussian prior. This is achieved by seeking an optimal Gaussian distribution minimizing the Kullback-Leibler divergence from the posterior distribution to the approximation, or equivalently maximizing the lower bound for the model evidence. We derive an explicit expression for the lower bound, and show the existence and uniqueness of the optimal Gaussian approximation. The lower bound functional can be viewed as a variant of classical Tikhonov regularization that penalizes also the covariance. Then we develop an efficient alternating direction maximization algorithm for solving the optimization problem, and analyze its convergence. We discuss strategies for reducing the computational complexity via low rank structure of the forward operator and the sparsity of the covariance. Further, as an application of the lower bound, we discuss hierarchical Bayesian modeling for selecting the hyperparameter in the prior distribution, and propose a monotonically convergent algorithm for determining the hyperparameter. We present extensive numerical experiments to illustrate the Gaussian approximation and the algorithms.