Goto

Collaborating Authors

 Learning Graphical Models


Probabilistic Reduced-Order Modeling for Stochastic Partial Differential Equations

arXiv.org Machine Learning

We discuss a Bayesian formulation to coarse-graining (CG) of PDEs where the coefficients (e.g. material parameters) exhibit random, fine scale variability. The direct solution to such problems requires grids that are small enough to resolve this fine scale variability which unavoidably requires the repeated solution of very large systems of algebraic equations. We establish a physically inspired, data-driven coarse-grained model which learns a low- dimensional set of microstructural features that are predictive of the fine-grained model (FG) response. Once learned, those features provide a sharp distribution over the coarse scale effec- tive coefficients of the PDE that are most suitable for prediction of the fine scale model output. This ultimately allows to replace the computationally expensive FG by a generative proba- bilistic model based on evaluating the much cheaper CG several times. Sparsity enforcing pri- ors further increase predictive efficiency and reveal microstructural features that are important in predicting the FG response. Moreover, the model yields probabilistic rather than single-point predictions, which enables the quantification of the unavoidable epistemic uncertainty that is present due to the information loss that occurs during the coarse-graining process.


Measuring Sample Quality with Stein's Method

arXiv.org Machine Learning

To improve the efficiency of Monte Carlo estimation, practitioners are turning to biased Markov chain Monte Carlo procedures that trade off asymptotic exactness for computational speed. The reasoning is sound: a reduction in variance due to more rapid sampling can outweigh the bias introduced. However, the inexactness creates new challenges for sampler and parameter selection, since standard measures of sample quality like effective sample size do not account for asymptotic bias. To address these challenges, we introduce a new computable quality measure based on Stein's method that quantifies the maximum discrepancy between sample and target expectations over a large class of test functions. We use our tool to compare exact, biased, and deterministic sample sequences and illustrate applications to hyperparameter selection, convergence rate assessment, and quantifying bias-variance tradeoffs in posterior inference.


MultiView Diffusion Maps

arXiv.org Machine Learning

In this study we consider learning a reduced dimensionality representation from datasets obtained under multiple views. Such multiple views of datasets can be obtained, for example, when the same underlying process is observed using several different modalities, or measured with different instrumentation. Our goal is to effectively exploit the availability of such multiple views for various purposes, such as non-linear embedding, manifold learning, spectral clustering, anomaly detection and non-linear system identification. Our proposed method exploits the intrinsic relation within each view, as well as the mutual relations between views. We do this by defining a cross-view model, in which an implied Random Walk process between objects is restrained to hop between the different views. Our method is robust to scaling of each dataset, and is insensitive to small structural changes in the data. Within this framework, we define new diffusion distances and analyze the spectra of the implied kernels. We demonstrate the applicability of the proposed approach on both artificial and real data sets.


The Mythos of Model Interpretability

arXiv.org Artificial Intelligence

Supervised machine learning models boast remarkable predictive capabilities. But can you trust your model? Will it work in deployment? What else can it tell you about the world? We want models to be not only good, but interpretable. And yet the task of interpretation appears underspecified. Papers provide diverse and sometimes non-overlapping motivations for interpretability, and offer myriad notions of what attributes render models interpretable. Despite this ambiguity, many papers proclaim interpretability axiomatically, absent further explanation. In this paper, we seek to refine the discourse on interpretability. First, we examine the motivations underlying interest in interpretability, finding them to be diverse and occasionally discordant. Then, we address model properties and techniques thought to confer interpretability, identifying transparency to humans and post-hoc explanations as competing notions. Throughout, we discuss the feasibility and desirability of different notions, and question the oft-made assertions that linear models are interpretable and that deep neural networks are not.



Neural Machine Translation and Sequence-to-sequence Models: A Tutorial

arXiv.org Machine Learning

This tutorial introduces a new and powerful set of techniques variously called "neural machine translation" or "neural sequence-to-sequence models". These techniques have been used in a number of tasks regarding the handling of human language, and can be a powerful tool in the toolbox of anyone who wants to model sequential data of some sort. The tutorial assumes that the reader knows the basics of math and programming, but does not assume any particular experience with neural networks or natural language processing. It attempts to explain the intuition behind the various methods covered, then delves into them with enough mathematical detail to understand them concretely, and culiminates with a suggestion for an implementation exercise, where readers can test that they understood the content in practice.


A Statistical Machine Learning Approach to Yield Curve Forecasting

arXiv.org Machine Learning

Yield curve forecasting is an important problem in finance. In this work we explore the use of Gaussian Processes in conjunction with a dynamic modeling strategy, much like the Kalman Filter, to model the yield curve. Gaussian Processes have been successfully applied to model functional data in a variety of applications. A Gaussian Process is used to model the yield curve. The hyper-parameters of the Gaussian Process model are updated as the algorithm receives yield curve data. Yield curve data is typically available as a time series with a frequency of one day. We compare existing methods to forecast the yield curve with the proposed method. The results of this study showed that while a competing method (a multivariate time series method) performed well in forecasting the yields at the short term structure region of the yield curve, Gaussian Processes perform well in the medium and long term structure regions of the yield curve. Accuracy in the long term structure region of the yield curve has important practical implications. The Gaussian Process framework yields uncertainty and probability estimates directly in contrast to other competing methods. Analysts are frequently interested in this information. In this study the proposed method has been applied to yield curve forecasting, however it can be applied to model high frequency time series data or data streams in other domains.


Autoencoding Variational Inference For Topic Models

arXiv.org Machine Learning

Topic models are one of the most popular methods for learning representations of text, but a major challenge is that any change to the topic model requires mathematically deriving a new inference algorithm. A promising approach to address this problem is autoencoding variational Bayes (AEVB), but it has proven diffi- cult to apply to topic models in practice. We present what is to our knowledge the first effective AEVB based inference method for latent Dirichlet allocation (LDA), which we call Autoencoded Variational Inference For Topic Model (AVITM). This model tackles the problems caused for AEVB by the Dirichlet prior and by component collapsing. We find that AVITM matches traditional methods in accuracy with much better inference time. Indeed, because of the inference network, we find that it is unnecessary to pay the computational cost of running variational optimization on test data. Because AVITM is black box, it is readily applied to new topic models. As a dramatic illustration of this, we present a new topic model called ProdLDA, that replaces the mixture model in LDA with a product of experts. By changing only one line of code from LDA, we find that ProdLDA yields much more interpretable topics, even if LDA is trained via collapsed Gibbs sampling.


An unsupervised bayesian approach for the joint reconstruction and classification of cutaneous reflectance confocal microscopy images

arXiv.org Machine Learning

This paper studies a new Bayesian algorithm for the joint reconstruction and classification of reflectance confocal microscopy (RCM) images, with application to the identification of human skin lentigo. The proposed Bayesian approach takes advantage of the distribution of the multiplicative speckle noise affecting the true reflectivity of these images and of appropriate priors for the unknown model parameters. A Markov chain Monte Carlo (MCMC) algorithm is proposed to jointly estimate the model parameters and the image of true reflectivity while classifying images according to the distribution of their reflectivity. Precisely, a Metropolis-within-Gibbs sampler is investigated to sample the posterior distribution of the Bayesian model associated with RCM images and to build estimators of its parameters, including labels indicating the class of each RCM image. The resulting algorithm is applied to synthetic data and to real images from a clinical study containing healthy and lentigo patients. The lentigo is a hyperplasia that affects the skin.


A Bayesian computer model analysis of Robust Bayesian analyses

arXiv.org Machine Learning

We harness the power of Bayesian emulation techniques, designed to aid the analysis of complex computer models, to examine the structure of complex Bayesian analyses themselves. These techniques facilitate robust Bayesian analyses and/or sensitivity analyses of complex problems, and hence allow global exploration of the impacts of choices made in both the likelihood and prior specification. We show how previously intractable problems in robustness studies can be overcome using emulation techniques, and how these methods allow other scientists to quickly extract approximations to posterior results corresponding to their own particular subjective specification. The utility and flexibility of our method is demonstrated on a reanalysis of a real application where Bayesian methods were employed to capture beliefs about river flow. We discuss the obvious extensions and directions of future research that such an approach opens up.