Uncertainty
Physics-constrained, data-driven discovery of coarse-grained dynamics
Felsberger, L., Koutsourelakis, P. S.
The combination of high-dimensionality and disparity of time scales encountered in many problems in computational physics has motivated the development of coarse-grained (CG) models. In this paper, we advocate the paradigm of data-driven discovery for extract- ing governing equations by employing fine-scale simulation data. In particular, we cast the coarse-graining process under a probabilistic state-space model where the transition law dic- tates the evolution of the CG state variables and the emission law the coarse-to-fine map. The directed probabilistic graphical model implied, suggests that given values for the fine- grained (FG) variables, probabilistic inference tools must be employed to identify the cor- responding values for the CG states and to that end, we employ Stochastic Variational In- ference. We advocate a sparse Bayesian learning perspective which avoids overfitting and reveals the most salient features in the CG evolution law. The formulation adopted enables the quantification of a crucial, and often neglected, component in the CG process, i.e. the pre- dictive uncertainty due to information loss. Furthermore, it is capable of reconstructing the evolution of the full, fine-scale system. We demonstrate the efficacy of the proposed frame- work in high-dimensional systems of random walkers.
Deep learning with t-exponential Bayesian kitchen sinks
Partaourides, Harris, Chatzis, Sotirios
Bayesian learning has been recently considered as an effective means of accounting for uncertainty in trained deep network parameters. This is of crucial importance when dealing with small or sparse training datasets. On the other hand, shallow models that compute weighted sums of their inputs, after passing them through a bank of arbitrary randomized nonlinearities, have been recently shown to enjoy good test error bounds that depend on the number of nonlinearities. Inspired from these advances, in this paper we examine novel deep network architectures, where each layer comprises a bank of arbitrary nonlinearities, linearly combined using multiple alternative sets of weights. We effect model training by means of approximate inference based on a t-divergence measure; this generalizes the Kullback-Leibler divergence in the context of the t-exponential family of distributions. We adopt the t-exponential family since it can more flexibly accommodate real-world data, that entail outliers and distributions with fat tails, compared to conventional Gaussian model assumptions. We extensively evaluate our approach using several challenging benchmarks, and provide comparative results to related state-of-the-art techniques.
Minimally Faithful Inversion of Graphical Models
Webb, Stefan, Golinski, Adam, Zinkov, Robert, Siddharth, N., Rainforth, Tom, Teh, Yee Whye, Wood, Frank
Inference amortization methods allow the sharing of statistical strength across related observations when learning to perform posterior inference. Generally this requires the inversion of the dependency structure in the generative model, as the modeller must design and learn a distribution to approximate the posterior. Previous methods invert the dependency structure in a heuristic way and fail to capture the dependencies in the model, therefore limiting the performance of the eventual inference algorithm. We introduce an algorithm for faithfully and minimally inverting the graphical model structure of any generative model. Such an inversion has two crucial properties: a) it does not encode any independence assertions absent from the model, and b) for a given inversion, it encodes as many true independence assertions as possible. Our algorithm works by simulating variable elimination on the generative model to reparametrize the distribution. We show with experiments how such minimal inversions can assist in performing better inference.
Machine Learning in Robotics - 5 Modern Applications
As the term "machine learning" has heated up, interest in "robotics" (as expressed in Google Trends) has not altered much over the last three years. So how much of a place is there for machine learning in robotics? While only a portion of recent developments in robotics can be credited to developments and uses of machine learning, I've aimed to collect some of the more prominent applications together in this article, along with links and references. Before I delve into machine learning in robotics, go ahead and define "robot". Though at first this might seem simple, it's no easy task to come to an agreement on just what a robot is and what it is not, even amongst roboticists.
Estimating the Spectral Density of Large Implicit Matrices
Adams, Ryan P., Pennington, Jeffrey, Johnson, Matthew J., Smith, Jamie, Ovadia, Yaniv, Patton, Brian, Saunderson, James
Many important problems are characterized by the eigenvalues of a large matrix. For example, the difficulty of many optimization problems, such as those arising from the fitting of large models in statistics and machine learning, can be investigated via the spectrum of the Hessian of the empirical loss function. Network data can be understood via the eigenstructure of a graph Laplacian matrix using spectral graph theory. Quantum simulations and other many-body problems are often characterized via the eigenvalues of the solution space, as are various dynamic systems. However, naive eigenvalue estimation is computationally expensive even when the matrix can be represented; in many of these situations the matrix is so large as to only be available implicitly via products with vectors. Even worse, one may only have noisy estimates of such matrix vector products. In this work, we combine several different techniques for randomized estimation and show that it is possible to construct unbiased estimators to answer a broad class of questions about the spectra of such implicit matrices, even in the presence of noise. We validate these methods on large-scale problems in which graph theory and random matrix theory provide ground truth.
Bayesian inference for bivariate ranks
Guillotte, Simon, Perron, Franรงois, Segers, Johan
A recommender system based on ranks is proposed, where an expert's ranking of a set of objects and a user's ranking of a subset of those objects are combined to make a prediction of the user's ranking of all objects. The rankings are assumed to be induced by latent continuous variables corresponding to the grades assigned by the expert and the user to the objects. The dependence between the expert and user grades is modelled by a copula in some parametric family. Given a prior distribution on the copula parameter, the user's complete ranking is predicted by the mode of the posterior predictive distribution of the user's complete ranking conditional on the expert's complete and the user's incomplete rankings. Various Markov chain Monte-Carlo algorithms are proposed to approximate the predictive distribution or only its mode. The predictive distribution can be obtained exactly for the Farlie-Gumbel-Morgenstern copula family, providing a benchmark for the approximation accuracy of the algorithms. The method is applied to the MovieLens 100k dataset with a Gaussian copula modelling dependence between the expert's and user's grades.
Boosting hazard regression with time-varying covariates
Lee, Donald K. K., Chen, Ningyuan
Consider a left-truncated right-censored survival process whose evolution depends on time-varying covariates. Given functional data samples from the process, we propose a practical boosting procedure for estimating its log-intensity function. Our method does not require any separability assumptions like Cox proportional- or Aalen additive-hazards, thus it can flexibly capture time-covariate interactions. The estimator is consistent if the model is correctly specified; alternatively an oracle inequality can be demonstrated for tree-based models. We use the procedure to shed new light on a question from the operations literature concerning the effect of workload on service rates in an emergency department.
Probabilistic Planning With Influence Diagrams
Lee, Junkyu (University of California, Irvine)
Graphical models provide a powerful framework for reasoning under uncertainty, and an influence diagram (ID) is a graphical model of a sequential decision problem that maximizes the total expected utility of a non-forgetting agent. Relaxing the regular modeling assumptions, an ID can be flexibly extended to general decision scenarios involving a limited memory agent or multi-agents. The approach of probabilistic planning with IDs is expected to gain computational leverage by exploiting the local structure as well as representation flexibility of influence diagram frameworks. My research focuses on graphical model inference for IDs and its application to probabilistic planning, targeting online MDP/POMDP planning as testbeds in the evaluation.
Hawkes Process Inference With Missing Data
Shelton, Christian R. (University of California, Riverside) | Qin, Zhen (University of California, Riverisde) | Shetty, Chandini (University of California, Riverside)
A multivariate Hawkes process is a class of marked point processes: A sample consists of a finite set of events of unbounded random size; each event has a real-valued time and a discrete-valued label (mark). It is self-excitatory: Each event causes an increase in the rate of other events (of either the same or a different label) in the (near) future. Prior work has developed methods for parameter estimation from complete samples. However, just as unobserved variables can increase the modeling power of other probabilistic models, allowing unobserved events can increase the modeling power of point processes. In this paper we develop a method to sample over the posterior distribution of unobserved events in a multivariate Hawkes process. We demonstrate the efficacy of our approach, and its utility in improving predictive power and identifying latent structure in real-world data.
Anytime Anyspace AND/OR Best-First Search for Bounding Marginal MAP
Lou, Qi (University of California, Irvine) | Dechter, Rina (University of California, Irvine) | Ihler, Alexander (University of California, Irvine)
Marginal MAP is a key task in Bayesian inference and decision-making. It is known to be very difficult in general, particularly because the evaluation of each MAP assignment requires solving an internal summation problem. In this paper, we propose a best-first search algorithm that provides anytime upper bounds for marginal MAP in graphical models. It folds the computation of external maximization and internal summation into an AND/OR tree search framework, and solves them simultaneously using a unified best-first search algorithm. The algorithm avoids some unnecessary computation of summation sub-problems associated with MAP assignments, and thus yields significant time savings. Furthermore, our algorithm is able to operate within limited memory. Empirical evaluation on three challenging benchmarks demonstrates that our unified best-first search algorithm using pre-compiled variational heuristics often provides tighter anytime upper bounds compared to those state-of-the-art baselines.