Uncertainty
Normalizing Flows on Riemannian Manifolds
Gemici, Mevlana C., Rezende, Danilo, Mohamed, Shakir
We consider the problem of density estimation on Riemannian manifolds. Density estimation on manifolds has many applications in fluid-mechanics, optics and plasma physics and it appears often when dealing with angular variables (such as used in protein folding, robot limbs, gene-expression) and in general directional statistics. In spite of the multitude of algorithms available for density estimation in the Euclidean spaces $\mathbf{R}^n$ that scale to large n (e.g. normalizing flows, kernel methods and variational approximations), most of these methods are not immediately suitable for density estimation in more general Riemannian manifolds. We revisit techniques related to homeomorphisms from differential geometry for projecting densities to sub-manifolds and use it to generalize the idea of normalizing flows to more general Riemannian manifolds. The resulting algorithm is scalable, simple to implement and suitable for use with automatic differentiation. We demonstrate concrete examples of this method on the n-sphere $\mathbf{S}^n$.
Correlated Random Measures
Ranganath, Rajesh, Blei, David
We develop correlated random measures, random measures where the atom weights can exhibit a flexible pattern of dependence, and use them to develop powerful hierarchical Bayesian nonparametric models. Hierarchical Bayesian nonparametric models are usually built from completely random measures, a Poisson-process based construction in which the atom weights are independent. Completely random measures imply strong independence assumptions in the corresponding hierarchical model, and these assumptions are often misplaced in real-world settings. Correlated random measures address this limitation. They model correlation within the measure by using a Gaussian process in concert with the Poisson process. With correlated random measures, for example, we can develop a latent feature model for which we can infer both the properties of the latent features and their dependency pattern. We develop several other examples as well. We study a correlated random measure model of pairwise count data. We derive an efficient variational inference algorithm and show improved predictive performance on large data sets of documents, web clicks, and electronic health records.
How Bayesian Inference Works
Bayesian inference is a way to get sharper predictions from your data. It's particularly useful when you don't have as much data as you would like and want to juice every last bit of predictive strength from it. Although it is sometimes described with reverence, Bayesian inference isn't magic or mystical. And even though the math under the hood can get dense, the concepts behind it are completely accessible. In brief, Bayesian inference lets you draw stronger conclusions from your data by folding in what you already know about the answer. Bayesian inference is based on the ideas of Thomas Bayes, a nonconformist Presbyterian minister in London about 300 years ago. He wrote two books, one on theology, and one on probability. His work included his now famous Bayes Theorem in raw form, which has since been applied to the problem of inference, the technical term for educated guessing. The popularity of Bayes' ideas was aided immeasurably by another minister, Richard Price.
Introduction to Machine Learning for Developers
Today's developers often hear about leveraging machine learning algorithms in order to build more intelligent applications, but many don't know where to start. One of the most important aspects of developing smart applications is to understand the underlying machine learning models, even if you aren't the person building them. Whether you are integrating a recommendation system into your app or building a chat bot, this guide will help you get started in understanding the basics of machine learning. This introduction to machine learning and list of resources is adapted from my October 2016 talk at ACT-W, a women's tech conference. While this is only a brief definition, machine learning means we can use statistical models and probabilistic algorithms to answer questions so we can make informative decisions based on our data.
Why Machine Learning and Big Data need Behavioral Economists
Researchers from Princeton University received mass media attention when they recently predicted the demise of Facebook. Data scientists at Facebook soon hit back with their own'study:' "In keeping with the scientific principle (used by Princeton) 'correlation equals causation,' our research unequivocally demonstrated that Princeton may be in danger of disappearing entirely." Is it surprising that the original Princeton study found its way onto the front pages of newspapers and magazines across the world? Probably not โ the fact is statistical results with a causal interpretation have a stronger effect on our thinking than non-causal information. What the data scientists at Princeton relied upon in presenting their paper was our individual human inability to think statistically.
Unifying Count-Based Exploration and Intrinsic Motivation
Bellemare, Marc G., Srinivasan, Sriram, Ostrovski, Georg, Schaul, Tom, Saxton, David, Munos, Remi
We consider an agent's uncertainty about its environment and the problem of generalizing this uncertainty across observations. Specifically, we focus on the problem of exploration in non-tabular reinforcement learning. Drawing inspiration from the intrinsic motivation literature, we use density models to measure uncertainty, and propose a novel algorithm for deriving a pseudo-count from an arbitrary density model. This technique enables us to generalize count-based exploration algorithms to the non-tabular case. We apply our ideas to Atari 2600 games, providing sensible pseudo-counts from raw pixels. We transform these pseudo-counts into intrinsic rewards and obtain significantly improved exploration in a number of hard games, including the infamously difficult Montezuma's Revenge.
Using Social Dynamics to Make Individual Predictions: Variational Inference with a Stochastic Kinetic Model
Xu, Zhen, Dong, Wen, Srihari, Sargur
Social dynamics is concerned primarily with interactions among individuals and the resulting group behaviors, modeling the temporal evolution of social systems via the interactions of individuals within these systems. In particular, the availability of large-scale data from social networks and sensor networks offers an unprecedented opportunity to predict state-changing events at the individual level. Examples of such events include disease transmission, opinion transition in elections, and rumor propagation. Unlike previous research focusing on the collective effects of social systems, this study makes efficient inferences at the individual level. In order to cope with dynamic interactions among a large number of individuals, we introduce the stochastic kinetic model to capture adaptive transition probabilities and propose an efficient variational inference algorithm the complexity of which grows linearly --- rather than exponentially --- with the number of individuals. To validate this method, we have performed epidemic-dynamics experiments on wireless sensor network data collected from more than ten thousand people over three years. The proposed algorithm was used to track disease transmission and predict the probability of infection for each individual. Our results demonstrate that this method is more efficient than sampling while nonetheless achieving high accuracy.
EM Algorithm and Stochastic Control in Economics
Kou, Steven, Peng, Xianhua, Xu, Xingbo
Generalising the idea of the classical EM algorithm that is widely used for computing maximum likelihood estimates, we propose an EM-Control (EM-C) algorithm for solving multi-period finite time horizon stochastic control problems. The new algorithm sequentially updates the control policies in each time period using Monte Carlo simulation in a forward-backward manner; in other words, the algorithm goes forward in simulation and backward in optimization in each iteration. Similar to the EM algorithm, the EM-C algorithm has the monotonicity of performance improvement in each iteration, leading to good convergence properties. We demonstrate the effectiveness of the algorithm by solving stochastic control problems in the monopoly pricing of perishable assets and in the study of real business cycle.
Communication-Efficient Distributed Statistical Inference
Jordan, Michael I., Lee, Jason D., Yang, Yun
We present a Communication-efficient Surrogate Likelihood (CSL) framework for solving distributed statistical inference problems. CSL provides a communication-efficient surrogate to the global likelihood that can be used for low-dimensional estimation, high-dimensional regularized estimation and Bayesian inference. For low-dimensional estimation, CSL provably improves upon naive averaging schemes and facilitates the construction of confidence intervals. For high-dimensional regularized estimation, CSL leads to a minimax-optimal estimator with controlled communication cost. For Bayesian inference, CSL can be used to form a communication-efficient quasi-posterior distribution that converges to the true posterior. This quasi-posterior procedure significantly improves the computational efficiency of MCMC algorithms even in a non-distributed setting. We present both theoretical analysis and experiments to explore the properties of the CSL approximation.