Uncertainty
Bayesian model and dimension reduction for uncertainty propagation: applications in random media
Grigo, Constantin, Koutsourelakis, Phaedon-Stelios
Well-established methods for the solution of stochastic partial differential equations (SPDEs) typically struggle in problems with high-dimensional inputs/outputs. Such difficulties are only amplified in large-scale applications where even a few tens of full-order model runs are impracticable. While dimensionality reduction can alleviate some of these issues, it is not known which and how many features of the (high-dimensional) input are actually predictive of the (high-dimensional) output. In this paper, we advocate a Bayesian formulation that is capable of performing simultaneous dimension and model-order reduction. It consists of a component that encodes the high-dimensional input into a low-dimensional set of feature functions by employing sparsity-enforcing priors and a decoding component that makes use of the solution of a coarse-grained model in order to reconstruct that of the full-order model. Both components are represented with latent variables in a probabilistic graphical model and are simultaneously trained using Stochastic Variational Inference methods. The model is capable of quantifying the predictive uncertainty due to the information loss that unavoidably takes place in any model-order/dimension reduction as well as the uncertainty arising from finite-sized training datasets. We demonstrate its capabilities in the context of random media where fine-scale fluctuations can give rise to random inputs with tens of thousands of variables. With a few tens of full-order model simulations, the proposed model is capable of identifying salient physical features and producing sharp predictions, under different boundary conditions, of the full output, which itself consists of thousands of components.
Artificial Intelligence and Robotics
Andreu-Perez, Javier, Deligianni, Fani, Ravi, Daniele, Yang, Guang-Zhong
The recent successes of AI have captured the wildest imagination of both the scientific communities and the general public. Robotics and AI amplify human potential, increase productivity and are moving from simple reasoning towards human-like cognitive abilities. Current AI technologies are used in a set of application areas, ranging from healthcare, manufacturing, transport and energy to financial services, banking, advertising, management consulting and government agencies. The global AI market was around 260 billion USD in 2016 and is estimated to exceed 3 trillion by 2024. To understand the impact of AI, it is important to draw lessons from its past successes and failures. This white paper provides a comprehensive explanation of the evolution of AI, its current status and future directions.
Feed-forward Uncertainty Propagation in Belief and Neural Networks
Shekhovtsov, Alexander, Flach, Boris, Busta, Michal
We propose a feed-forward inference method applicable to belief and neural networks. In a belief network, the method estimates an approximate factorized posterior of all hidden units given the input. In neural networks the method propagates uncertainty of the input through all the layers. In neural networks with injected noise, the method analytically takes into account uncertainties resulting from this noise. Such feed-forward analytic propagation is differentiable in parameters and can be trained end-to-end. Compared to standard NN, which can be viewed as propagating only the means, we propagate the mean and variance. The method can be useful in all scenarios that require knowledge of the neuron statistics, e.g. when dealing with uncertain inputs, considering sigmoid activations as probabilities of Bernoulli units, training the models regularized by injected noise (dropout) or estimating activation statistics over the dataset (as needed for normalization methods). In the experiments we show the possible utility of the method in all these tasks as well as its current limitations.
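The idea of propagating a mean and a variance, rather than only a mean, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes independent Gaussian inputs, uses the standard closed-form moments of a rectified Gaussian, and the function names `linear_moments` and `relu_moments` are hypothetical.

```python
import math


def linear_moments(mean, var, weights, bias):
    """Propagate mean and variance through y = W x + b.

    Assumes the inputs are independent, so output variances are
    sums of squared weights times input variances.
    """
    out_mean = [
        sum(w * m for w, m in zip(row, mean)) + b
        for row, b in zip(weights, bias)
    ]
    out_var = [sum(w * w * v for w, v in zip(row, var)) for row in weights]
    return out_mean, out_var


def relu_moments(mean, var):
    """Mean and variance of max(0, x) for x ~ N(mean, var), elementwise."""
    out_mean, out_var = [], []
    for m, v in zip(mean, var):
        if v <= 0.0:
            # Deterministic input: the ReLU output has no variance.
            out_mean.append(max(0.0, m))
            out_var.append(0.0)
            continue
        s = math.sqrt(v)
        z = m / s
        cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
        pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        first = m * cdf + s * pdf                   # E[max(0, x)]
        second = (m * m + v) * cdf + m * s * pdf    # E[max(0, x)^2]
        out_mean.append(first)
        out_var.append(max(second - first * first, 0.0))
    return out_mean, out_var
```

Chaining `linear_moments` and `relu_moments` layer by layer gives a single feed-forward pass that carries both statistics; treating the post-activation values as independent Gaussians at each layer is the approximation this sketch relies on.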
Graphite: Iterative Generative Modeling of Graphs
Grover, Aditya, Zweig, Aaron, Ermon, Stefano
Graphs are a fundamental abstraction for modeling relational data. However, graphs are discrete and combinatorial in nature, and learning representations suitable for machine learning tasks poses statistical and computational challenges. In this work, we propose Graphite, an algorithmic framework for unsupervised learning of representations over nodes in a graph using deep latent variable generative models. Our model is based on variational autoencoders (VAE), and differs from existing VAE frameworks for data modalities such as images, speech, and text in the use of graph neural networks for parameterizing both the generative model (i.e., decoder) and inference model (i.e., encoder). The use of graph neural networks incorporates inductive biases due to the spatial, local structure of graphs directly in the generative model. Moreover, we draw novel connections between graph neural networks and approximate inference via kernel embeddings of distributions. We demonstrate empirically that Graphite outperforms state-of-the-art approaches for the tasks of density estimation, link prediction, and node classification on synthetic and benchmark datasets.
Pseudo-marginal Bayesian inference for supervised Gaussian process latent variable models
Gadd, Charles, Wade, Sara, Shah, Akeel, Grammatopoulos, Dimitris
We introduce a Bayesian framework for inference with a supervised version of the Gaussian process latent variable model. The framework overcomes the high correlations between latent variables and hyperparameters by using an unbiased pseudo estimate for the marginal likelihood that approximately integrates over the latent variables. This is used to construct a Markov Chain to explore the posterior of the hyperparameters. We demonstrate the procedure on simulated and real examples, showing its ability to capture uncertainty and multimodality of the hyperparameters and improved uncertainty quantification in predictions when compared with variational inference.
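The pseudo-marginal idea summarized above can be sketched generically: a Metropolis-Hastings chain over the hyperparameters in which the exact marginal likelihood is replaced by a noisy estimate, and the estimate for the current state is stored and reused rather than recomputed. This is a minimal sketch under assumptions, not the authors' procedure: the hyperparameter is a single scalar, the proposal is a Gaussian random walk, and `log_lik_estimator` is a hypothetical user-supplied function returning the log of an unbiased likelihood estimate.

```python
import math
import random


def pseudo_marginal_mh(log_lik_estimator, log_prior, theta0, n_steps, step_size, rng):
    """Random-walk Metropolis-Hastings with an estimated likelihood.

    log_lik_estimator(theta, rng) is assumed to return the log of an
    unbiased estimate of the marginal likelihood at theta.
    """
    theta = theta0
    # The estimate at the current state is kept until a proposal is accepted.
    log_target = log_lik_estimator(theta, rng) + log_prior(theta)
    chain = []
    for _ in range(n_steps):
        proposal = theta + rng.gauss(0.0, step_size)
        log_target_prop = log_lik_estimator(proposal, rng) + log_prior(proposal)
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < log_target_prop - log_target:
            theta, log_target = proposal, log_target_prop
        chain.append(theta)
    return chain
```

Reusing the stored estimate for the current state, instead of refreshing it at every iteration, is what keeps the chain targeting the exact posterior despite the noise in the estimator.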
Stochastic Variational Inference with Gradient Linearization
Plötz, Tobias, Wannenwetsch, Anne S., Roth, Stefan
Variational inference has experienced a recent surge in popularity owing to stochastic approaches, which have yielded practical tools for a wide range of model classes. A key benefit is that stochastic variational inference obviates the tedious process of deriving analytical expressions for closed-form variable updates. Instead, one simply needs to derive the gradient of the log-posterior, which is often much easier. Yet for certain model classes, the log-posterior itself is difficult to optimize using standard gradient techniques. One such example is the class of random field models, where optimization based on gradient linearization has proven popular, since it speeds up convergence significantly and can avoid poor local optima. In this paper we propose stochastic variational inference with gradient linearization (SVIGL). It is as convenient as standard stochastic variational inference: all that is required is a local linearization of the energy gradient. Its benefit over stochastic variational inference with conventional gradient methods is a clear improvement in convergence speed, while yielding comparable or even better variational approximations in terms of KL divergence. We demonstrate the benefits of SVIGL in three applications: optical flow estimation, Poisson-Gaussian denoising, and 3D surface reconstruction.
Estimating causal effects of time-dependent exposures on a binary endpoint in a high-dimensional setting
Asvatourian, Vahé, Coutzac, Clélia, Chaput, Nathalie, Robert, Caroline, Michiels, Stefan, Lanoy, Emilie
Recently, the intervention calculus when the DAG is absent (IDA) method was developed to estimate lower bounds of causal effects from observational high-dimensional data. Originally it was introduced to assess the effect of baseline biomarkers which do not vary over time. However, in many clinical settings, measurements of biomarkers are repeated at fixed time points during treatment exposure and, therefore, this method needs to be extended. The purpose of this paper is to extend the first step of the IDA, the Peter-Clark (PC) algorithm, to a time-dependent exposure in the context of a binary outcome. We generalised the PC-algorithm to take into account the chronological order of repeated measurements of the exposure and propose to apply the IDA with our new version, the chronologically ordered PC-algorithm (COPC-algorithm). A simulation study was performed before applying the method to estimate causal effects of time-dependent immunological biomarkers on toxicity, death and progression in patients with metastatic melanoma. The simulation study showed that the completed partially directed acyclic graphs (CPDAGs) obtained using the COPC-algorithm were structurally closer to the true CPDAG than CPDAGs obtained using the PC-algorithm. Also, causal effects were more accurate when they were estimated based on CPDAGs obtained using the COPC-algorithm. Moreover, the COPC-algorithm allowed the removal of non-chronological arrows, in which a variable measured at a time t points to a variable measured at a time t' where t' < t. Bidirected edges were less frequent in CPDAGs obtained with the COPC-algorithm, consistent with the lower variability in causal effects estimated from these CPDAGs. The COPC-algorithm provided CPDAGs that keep the chronological structure present in the data, thus allowing estimation of lower bounds of the causal effect of time-dependent biomarkers.
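The chronological constraint described above, that no arrow may point from a variable measured at time t to one measured at an earlier time t', can be illustrated with a small filter over directed edges. This is only a sketch of that one rule, not the COPC-algorithm itself; the function name `drop_non_chronological` and the variable labels in the usage example are hypothetical.

```python
def drop_non_chronological(edges, measured_at):
    """Keep only directed edges that do not point backwards in time.

    edges: iterable of (source, target) pairs.
    measured_at: mapping from variable name to its measurement time.
    """
    kept = []
    for source, target in edges:
        # An arrow from time t to an earlier time t' < t is removed.
        if measured_at[target] < measured_at[source]:
            continue
        kept.append((source, target))
    return kept


# Hypothetical example: a biomarker measured at two visits and a later outcome.
times = {"marker_t0": 0, "marker_t1": 1, "toxicity": 2}
candidate_edges = [
    ("marker_t0", "marker_t1"),
    ("marker_t1", "marker_t0"),  # points backwards in time
    ("marker_t1", "toxicity"),
]
chronological_edges = drop_non_chronological(candidate_edges, times)
```

Edges between variables measured at the same time are left untouched by this filter, since the rule only concerns arrows that go back in time.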
Technical Perspective: Expressive Probabilistic Models and Scalable Method of Moments
Across diverse fields, investigators face problems and opportunities involving data. Scientists, scholars, engineers, and other analysts seek new methods to ingest data, extract salient patterns, and then use the results for prediction and understanding. These methods come from machine learning (ML), which is quickly becoming core to modern technological systems, modern scientific workflow, and modern approaches to understanding data. The classical approach to solving a problem with ML follows the "cookbook" approach, one where the scientist shoehorns her data and problem to match the inputs and outputs of a reliable ML method. This strategy has been successful in many domains--examples include spam filtering, speech recognition, and movie recommendation--but it can only take us so far.
Safe end-to-end imitation learning for model predictive control
Lee, Keuntaek, Saigol, Kamil, Theodorou, Evangelos
We propose the use of Bayesian networks, which provide both a mean value and an uncertainty estimate as output, to enhance the safety of learned control policies under circumstances in which a test-time input differs significantly from the training set. Our algorithm combines reinforcement learning and end-to-end imitation learning to simultaneously learn a control policy as well as a threshold over the predictive uncertainty of the learned model, with no hand-tuning required. Corrective action, such as a return of control to the model predictive controller or human expert, is taken when the uncertainty threshold is exceeded. We demonstrate that our method is robust to uncertainty resulting from varying system dynamics as well as from partial state observability. As the deployment of deep neural networks as controllers for physical robotic systems becomes more prevalent, the issue of safety within artificial intelligence becomes an increasingly important concern. Recently the use of end-to-end imitation learning to develop neural network control policies has surged in popularity, due in large part to the ease with which deep models can learn complex dynamics and infer global state from local data while bypassing the need for significant parameter tuning. In contrast, traditional approaches to vision-based control rely on methods such as image segmentation and object detection, classification, labeling, and filtering; often, these methods require significant engineering and tuning.
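The corrective-action rule in this abstract, handing control back to the expert when predictive uncertainty exceeds a threshold, can be sketched as follows. This is an illustration under assumptions rather than the authors' algorithm: the action is a scalar, uncertainty is measured as the standard deviation of several sampled predictions, and the threshold is passed in by hand, whereas the paper learns it. The function name `select_action` is hypothetical.

```python
from statistics import mean, pstdev


def select_action(sampled_actions, expert_action, threshold):
    """Choose between the learned policy and the expert.

    sampled_actions: scalar predictions from repeated stochastic
    forward passes of the learned model (assumed to be available).
    Returns (action, source), where source is "learned" or "expert".
    """
    spread = pstdev(sampled_actions)  # predictive uncertainty estimate
    if spread > threshold:
        # Uncertainty too high: return control to the expert controller.
        return expert_action, "expert"
    return mean(sampled_actions), "learned"
```

When the sampled predictions agree closely, the mean prediction is used; when they disagree by more than the threshold, the expert's action is returned instead.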