Goto

Collaborating Authors

 Uncertainty


Analyzing Games with Ambiguous Player Types using the ${\rm MINthenMAX}$ Decision Model

arXiv.org Artificial Intelligence

In many common interactive scenarios, participants lack information about other participants, and specifically about the preferences of other participants. In this work, we model an extreme case of incomplete information, which we term games with type ambiguity, where a participant lacks even information enabling him to form a belief on the preferences of others. Under type ambiguity, one cannot analyze the scenario using the commonly used Bayesian framework, and therefore he needs to model the participants using a different decision model. In this work, we present the ${\rm MINthenMAX}$ decision model under ambiguity. This model is a refinement of Wald's MiniMax principle, which we show to be too coarse for games with type ambiguity. We characterize ${\rm MINthenMAX}$ as the finest refinement of the MiniMax principle that satisfies three properties we claim are necessary for games with type ambiguity. This prior-less approach we present her also follows the common practice in computer science of worst-case analysis. Finally, we define and analyze the corresponding equilibrium concept assuming all players follow ${\rm MINthenMAX}$. We demonstrate this equilibrium by applying it to two common economic scenarios: coordination games and bilateral trade. We show that in both scenarios, an equilibrium in pure strategies always exists and we analyze the equilibria.


How Bayesian Inference Works: Tutorial

@machinelearnbot

Bayesian inference is a way to get sharper predictions from your data. It's particularly useful when you don't have as much data as you would like and want to juice every last bit of predictive strength from it. Although it is sometimes described with reverence, Bayesian inference isn't magic or mystical. And even though the math under the hood can get dense, the concepts behind it are completely accessible. In brief, Bayesian inference lets you draw stronger conclusions from your data by folding in what you already know about the answer.


The Mathematics of Machine Learning

#artificialintelligence

In the last few months, I have had several people contact me about their enthusiasm for venturing into the world of data science and using Machine Learning (ML) techniques to probe statistical regularities and build impeccable data-driven products. However, I have observed that some actually lack the necessary mathematical intuition and framework to get useful results. This is the main reason I decided to write this blog post. Recently, there has been an upsurge in the availability of many easy-to-use machine and deep learning packages such as scikit-learn, Weka, Tensorflow, R-caret etc. Machine Learning theory is a field that intersects statistical, probabilistic, computer science and algorithmic aspects arising from learning iteratively from data and finding hidden insights which can be used to build intelligent applications. Despite the immense possibilities of Machine and Deep Learning, a thorough mathematical understanding of many of these techniques is necessary for a good grasp of the inner workings of the algorithms and getting good results.


A primer on universal function approximation with deep learning (in Torch and R)

@machinelearnbot

Arthur C. Clarke famously stated that "any sufficiently advanced technology is indistinguishable from magic." No current technology embodies this statement more than neural networks and deep learning. And like any good magic it not only dazzles and inspires but also puts fear into people's hearts. One known property of artificial neural networks (ANNs) is that they are universal function approximators. This means that any mathematical function can be represented by a neural network.


Tensor Decomposition via Variational Auto-Encoder

arXiv.org Machine Learning

Tensor decomposition is an important technique for capturing the high-order interactions among multiway data. Multi-linear tensor composition methods, such as the Tucker decomposition and the CANDECOMP/PARAFAC (CP), assume that the complex interactions among objects are multi-linear, and are thus insufficient to represent nonlinear relationships in data. Another assumption of these methods is that a predefined rank should be known. However, the rank of tensors is hard to estimate, especially for cases with missing values. To address these issues, we design a Bayesian generative model for tensor decomposition. Different from the traditional Bayesian methods, the high-order interactions of tensor entries are modeled with variational auto-encoder. The proposed model takes advantages of Neural Networks and nonparametric Bayesian models, by replacing the multi-linear product in traditional Bayesian tensor decomposition with a complex nonlinear function (via Neural Networks) whose parameters can be learned from data. Experimental results on synthetic data and real-world chemometrics tensor data have demonstrated that our new model can achieve significantly higher prediction performance than the state-of-the-art tensor decomposition approaches.


Gaussian Processes for Survival Analysis

arXiv.org Machine Learning

We introduce a semi-parametric Bayesian model for survival analysis. The model is centred on a parametric baseline hazard, and uses a Gaussian process to model variations away from it nonparametrically, as well as dependence on covariates. As opposed to many other methods in survival analysis, our framework does not impose unnecessary constraints in the hazard rate or in the survival function. Furthermore, our model handles left, right and interval censoring mechanisms common in survival analysis. We propose a MCMC algorithm to perform inference and an approximation scheme based on random Fourier features to make computations faster. We report experimental results on synthetic and real data, showing that our model performs better than competing models such as Cox proportional hazards, ANOVA-DDP and random survival forests.


A nonparametric HMM for genetic imputation and coalescent inference

arXiv.org Machine Learning

Genetic sequence data are well described by hidden Markov models (HMMs) in which latent states correspond to clusters of similar mutation patterns. Theory from statistical genetics suggests that these HMMs are nonhomogeneous (their transition probabilities vary along the chromosome) and have large support for self transitions. We develop a new nonparametric model of genetic sequence data, based on the hierarchical Dirichlet process, which supports these self transitions and nonhomogeneity. Our model provides a parameterization of the genetic process that is more parsimonious than other more general nonparametric models which have previously been applied to population genetics. We provide truncation-free MCMC inference for our model using a new auxiliary sampling scheme for Bayesian nonparametric HMMs. In a series of experiments on male X chromosome data from the Thousand Genomes Project and also on data simulated from a population bottleneck we show the benefits of our model over the popular finite model fastPHASE, which can itself be seen as a parametric truncation of our model. We find that the number of HMM states found by our model is correlated with the time to the most recent common ancestor in population bottlenecks. This work demonstrates the flexibility of Bayesian nonparametrics applied to large and complex genetic data.


Improving variational methods via pairwise linear response identities

arXiv.org Machine Learning

Inference methods are often formulated as variational approximations: these approximations allow easy evaluation of statistics by marginalization or linear response, but these estimates can be inconsistent. We show that by introducing constraints on covariance, one can ensure consistency of linear response with the variational parameters, and in so doing inference of marginal probability distributions is improved. For the Bethe approximation and its generalizations, improvements are achieved with simple choices of the constraints. The approximations are presented as variational frameworks; iterative procedures related to message passing are provided for finding the minima.


Building Machines That Learn and Think Like People

arXiv.org Artificial Intelligence

Recent progress in artificial intelligence (AI) has renewed interest in building systems that learn and think like people. Many advances have come from using deep neural networks trained end-to-end in tasks such as object recognition, video games, and board games, achieving performance that equals or even beats humans in some respects. Despite their biological inspiration and performance achievements, these systems differ from human intelligence in crucial ways. We review progress in cognitive science suggesting that truly human-like learning and thinking machines will have to reach beyond current engineering trends in both what they learn, and how they learn it. Specifically, we argue that these machines should (a) build causal models of the world that support explanation and understanding, rather than merely solving pattern recognition problems; (b) ground learning in intuitive theories of physics and psychology, to support and enrich the knowledge that is learned; and (c) harness compositionality and learning-to-learn to rapidly acquire and generalize knowledge to new tasks and situations. We suggest concrete challenges and promising routes towards these goals that can combine the strengths of recent neural network advances with more structured cognitive models.


Quick and energy-efficient Bayesian computing of binocular disparity using stochastic digital signals

arXiv.org Artificial Intelligence

Reconstruction of the tridimensional geometry of a visual scene using the binocular disparity information is an important issue in computer vision and mobile robotics, which can be formulated as a Bayesian inference problem. However, computation of the full disparity distribution with an advanced Bayesian model is usually an intractable problem, and proves computationally challenging even with a simple model. In this paper, we show how probabilistic hardware using distributed memory and alternate representation of data as stochastic bitstreams can solve that problem with high performance and energy efficiency. We put forward a way to express discrete probability distributions using stochastic data representations and perform Bayesian fusion using those representations, and show how that approach can be applied to diparity computation. We evaluate the system using a simulated stochastic implementation and discuss possible hardware implementations of such architectures and their potential for sensorimotor processing and robotics.