 Uncertainty


How Bayesian Inference Works

#artificialintelligence

Bayesian inference is a way to get sharper predictions from your data. It's particularly useful when you don't have as much data as you would like and want to juice every last bit of predictive strength from it. Although it is sometimes described with reverence, Bayesian inference isn't magic or mystical. And even though the math under the hood can get dense, the concepts behind it are completely accessible. In brief, Bayesian inference lets you draw stronger conclusions from your data by folding in what you already know about the answer.
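A minimal sketch of "folding in what you already know" is the Beta-Binomial update below. The scenario and every number in it are hypothetical and chosen only for illustration: a success rate is believed to be around 0.8 before any data arrives (encoded as a Beta(8, 2) prior), and then only three trials are observed.

```python
def beta_binomial_update(prior_a, prior_b, successes, failures):
    """Return the Beta posterior parameters after observing binomial data."""
    return prior_a + successes, prior_b + failures


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Hypothetical prior belief: success rate near 0.8, worth about 10 pseudo-observations.
prior = (8.0, 2.0)

# Hypothetical small data set: 1 success and 2 failures.
successes, failures = 1, 2

posterior = beta_binomial_update(*prior, successes=successes, failures=failures)

# Estimate from the data alone versus the estimate that also uses the prior.
data_only_estimate = successes / (successes + failures)
posterior_estimate = beta_mean(*posterior)

print("data only:", data_only_estimate)
print("with prior:", posterior_estimate)
```

With so few observations the data-only estimate swings to about one third, while the posterior mean stays between that value and the prior mean. As more data arrives, the prior's influence shrinks.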


ProMoca: Probabilistic Modeling and Analysis of Agents in Commitment Protocols

Journal of Artificial Intelligence Research

Social commitment protocols regulate interactions of agents in multiagent systems. Several methods have been developed to analyze properties of commitment protocols. However, analysis of an agent's behavior in a commitment protocol, which should take into account the agent's goals and beliefs, has received less attention. In this paper we present the ProMoca framework to address this issue. Firstly, we develop an expressive formal language to model agents with respect to their commitments. Our language provides dedicated elements to define commitment protocols, and to model agents in terms of their goals, behaviors, and beliefs. Furthermore, our language provides probabilistic and non-deterministic elements to model uncertainty in agents' beliefs. Secondly, we identify two essential properties of an agent with respect to a commitment protocol, namely compliance and goal satisfaction. We formalize these properties using a probabilistic variant of linear temporal logic. Thirdly, we adapt a probabilistic model checking algorithm to automatically analyze compliance and goal satisfaction properties. Finally, we present empirical results on the efficiency and scalability of ProMoca.


Robust Variational Inference

arXiv.org Machine Learning

Variational inference is a powerful tool for approximate inference. However, it mainly focuses on the evidence lower bound as the variational objective, and the development of other measures for variational inference is a promising area of research. This paper proposes a robust modification of the evidence and a lower bound for it, which is applicable when the majority of the training set samples are random noise objects. We provide experiments for variational autoencoders to show the advantage of the proposed objective over the evidence lower bound on synthetic datasets obtained by adding uninformative noise objects to MNIST and OMNIGLOT. Additionally, for the original MNIST and OMNIGLOT datasets we observe a small improvement over the non-robust evidence lower bound.
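For reference, the standard (non-robust) evidence lower bound that the abstract compares against can be written as follows. The notation is an assumption, not taken from the paper: $x$ is an observation, $z$ a latent variable, $p_\theta$ the generative model, and $q_\phi$ the variational posterior. The paper's robust objective is not reproduced here.

```latex
\log p_\theta(x)
  \;\ge\;
  \mathbb{E}_{q_\phi(z \mid x)}\!\left[ \log p_\theta(x, z) - \log q_\phi(z \mid x) \right]
  \;=\;
  \log p_\theta(x) - \mathrm{KL}\!\left( q_\phi(z \mid x) \,\|\, p_\theta(z \mid x) \right)
```

The gap between the evidence and the bound is the KL divergence from the variational posterior to the true posterior, so the bound is tight exactly when the two coincide.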


Quantum Machine Learning

arXiv.org Machine Learning

Massachusetts Institute of Technology, Department of Mechanical Engineering, Cambridge MA 02139 USA

Recent progress implies that a crossover between machine learning and quantum information processing benefits both fields. Traditional machine learning has dramatically improved the benchmarking and control of experimental quantum computing systems, including adaptive quantum phase estimation and designing quantum computing gates. On the other hand, quantum mechanics offers tantalizing prospects to enhance machine learning, ranging from reduced computational complexity to improved generalization performance. The most notable examples include quantum enhanced algorithms for principal component analysis, quantum support vector machines, and quantum Boltzmann machines. Progress has been rapid, fostered by demonstrations of midsized quantum optimizers which are predicted to soon outperform their classical counterparts. Further, we are witnessing the emergence of a physical theory pinpointing the fundamental and natural limitations of learning. Here we survey the cutting edge of this merger and list several open problems.

Machine learning has fundamentally changed the way humans interact with and relate to data. Applications range from self-driving cars to intelligent agents capable of exceeding the best humans at Jeopardy and Go. These applications exhibit large data sets and push current algorithms and computational resources to their limit. Information is fundamentally governed by the laws of physics. The laws are quantum mechanical at the scales of present day information processing technology, in contrast to the more familiar 'classical' physics at the human scale. The interface of quantum physics and machine learning naturally goes both ways: machine learning algorithms find application in understanding and controlling quantum systems and, on the other hand, quantum computational devices promise enhancement of the performance of machine learning algorithms for problems beyond the reach of classical computing.


Learning without recall in directed circles and rooted trees

arXiv.org Machine Learning

This work investigates the case of a network of agents that attempt to learn some unknown state of the world among finitely many possibilities. At each time step, agents all receive random, independently distributed private signals whose distributions depend on the unknown state of the world. However, it may be the case that some or any of the agents cannot distinguish between two or more of the possible states based only on their private observations, as when several states result in the same distribution of the private signals. In our model, the agents form some initial belief (probability distribution) about the unknown state and then refine their beliefs in accordance with their private observations, as well as the beliefs of their neighbors. An agent learns the unknown state when her belief converges to a point mass that is concentrated at the true state. A rational agent would use Bayes' rule to incorporate her neighbors' beliefs and own private signals over time. While such repeated applications of Bayes' rule in networks can become computationally intractable, in this paper we show that in the canonical cases of directed star, circle or path networks and their combinations, one can derive a class of memoryless update rules that replicate that of a single Bayesian agent but replace the self beliefs with the beliefs of the neighbors. This way, one can realize an exponentially fast rate of learning similar to the case of Bayesian (fully rational) agents. The proposed rules are a special case of Learning without Recall.
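The single-agent Bayesian update that the abstract uses as its baseline can be sketched as below. This is not the paper's memoryless network rule; it only illustrates Bayes' rule over finitely many states. The three states, the signal likelihoods, and the fixed signal sequence are all hypothetical. States 1 and 2 are deliberately given identical signal distributions to show the indistinguishability the abstract mentions.

```python
import numpy as np


def bayes_update(belief, likelihoods, signal):
    """One application of Bayes' rule.

    belief:      shape (num_states,), current probability of each state
    likelihoods: shape (num_states, num_signals), row s is P(signal | state s)
    signal:      index of the observed private signal
    """
    unnormalized = belief * likelihoods[:, signal]
    return unnormalized / unnormalized.sum()


# Hypothetical model: 3 states, binary signals.
# States 1 and 2 induce the same signal distribution, so private
# observations alone can never tell them apart.
likelihoods = np.array([
    [0.7, 0.3],
    [0.4, 0.6],
    [0.4, 0.6],
])

# Hypothetical fixed signal sequence, roughly what state 0 would produce
# (70% zeros), used instead of random draws so the run is reproducible.
signals = [0, 0, 1, 0, 0, 1, 0, 0, 1, 0] * 5

belief = np.full(3, 1.0 / 3.0)  # uniform initial belief
for signal in signals:
    belief = bayes_update(belief, likelihoods, signal)

print(belief)
```

The belief concentrates on state 0, while the two indistinguishable states keep equal probability at every step. An agent whose true state was 1 or 2 would need her neighbors' beliefs to separate them.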


Machine Learning on Human Connectome Data from MRI

arXiv.org Machine Learning

Functional MRI (fMRI) and diffusion MRI (dMRI) are non-invasive imaging modalities that allow in-vivo analysis of a patient's brain network (known as a connectome). Use of these technologies has enabled faster and better diagnoses and treatments of neurological disorders and a deeper understanding of the human brain. Recently, researchers have been exploring the application of machine learning models to connectome data in order to predict clinical outcomes and analyze the importance of subnetworks in the brain. Connectome data has unique properties, which present both special challenges and opportunities when used for machine learning. The purpose of this work is to review the literature on the topic of applying machine learning models to MRI-based connectome data. This field is growing rapidly and now encompasses a large body of research. To summarize the research done to date, we provide a comparative, structured summary of 77 relevant works, tabulated according to different criteria, that represent the majority of the literature on this topic. (We also published a living version of this table online at http://connectomelearning.cs.sfu.ca that the community can continue to contribute to.) After giving an overview of how connectomes are constructed from dMRI and fMRI data, we discuss the variety of machine learning tasks that have been explored with connectome data. We then compare the advantages and drawbacks of different machine learning approaches that have been employed, discussing different feature selection and feature extraction schemes, as well as the learning models and regularization penalties themselves. Throughout this discussion, we focus particularly on how the methods are adapted to the unique nature of graphical connectome data. Finally, we conclude by summarizing the current state of the art and by outlining what we believe are strategic directions for future research.


Learning to Draw Samples: With Application to Amortized MLE for Generative Adversarial Learning

arXiv.org Machine Learning

We propose a simple algorithm to train stochastic neural networks to draw samples from given target distributions for probabilistic inference. Our method is based on iteratively adjusting the neural network parameters so that the output changes along a Stein variational gradient that maximally decreases the KL divergence with the target distribution. Our method works for any target distribution specified by its unnormalized density function, and can train any black-box architecture that is differentiable in terms of the parameters we want to adapt. As an application of our method, we propose an amortized MLE algorithm for training deep energy models, where a neural sampler is adaptively trained to approximate the likelihood function. Our method mimics an adversarial game between the deep energy model and the neural sampler, and obtains realistic-looking images competitive with the state-of-the-art results.
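The Stein variational gradient mentioned in the abstract can be illustrated with plain Stein variational gradient descent on a set of particles. This is a simplified sketch and not the paper's amortized method: it moves particles directly instead of training a neural sampler. The RBF kernel with fixed bandwidth, the 2-D Gaussian target, the step size, and all function names are assumptions made for the example.

```python
import numpy as np


def rbf_kernel_and_grad(particles, bandwidth):
    """RBF kernel matrix and its gradient.

    Returns kernel[i, j] = k(x_j, x_i) and
    grad_kernel[i, j] = gradient of k(x_j, x_i) with respect to x_j.
    """
    diff = particles[:, None, :] - particles[None, :, :]  # diff[i, j] = x_i - x_j
    sq_dist = np.sum(diff ** 2, axis=-1)
    kernel = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
    grad_kernel = kernel[:, :, None] * diff / bandwidth ** 2
    return kernel, grad_kernel


def svgd_step(particles, score_fn, step_size, bandwidth):
    """Move every particle one step along the Stein variational gradient."""
    kernel, grad_kernel = rbf_kernel_and_grad(particles, bandwidth)
    n = particles.shape[0]
    # First term pulls particles toward high density, second term repels them
    # from each other so they do not collapse onto the mode.
    direction = (kernel @ score_fn(particles) + grad_kernel.sum(axis=1)) / n
    return particles + step_size * direction


# Hypothetical target: unit-covariance Gaussian. Only its score
# (gradient of the log density) is needed, so the normalizer never appears.
target_mean = np.array([2.0, -1.0])


def gaussian_score(x):
    return -(x - target_mean)


rng = np.random.default_rng(0)
particles = rng.normal(size=(50, 2))  # start from a standard normal
for _ in range(500):
    particles = svgd_step(particles, gaussian_score, step_size=0.1, bandwidth=1.0)

print("particle mean:", particles.mean(axis=0))
```

Because the update uses only the score function, the target may be specified by an unnormalized density, which matches the requirement stated in the abstract.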


Infinite Variational Autoencoder for Semi-Supervised Learning

arXiv.org Machine Learning

This paper presents an infinite variational autoencoder (VAE) whose capacity adapts to suit the input data. This is achieved using a mixture model where the mixing coefficients are modeled by a Dirichlet process, allowing us to integrate over the coefficients when performing inference. Critically, this then allows us to automatically vary the number of autoencoders in the mixture based on the data. Experiments show the flexibility of our method, particularly for semi-supervised learning, where only a small number of training samples are available.
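One common way to picture mixing coefficients drawn from a Dirichlet process is the truncated stick-breaking construction sketched below. Whether the paper uses this construction is not stated in the abstract, which says the coefficients are integrated over, so this is an illustration only. The concentration values, the truncation level, and the function name are assumptions.

```python
import numpy as np


def stick_breaking_weights(concentration, num_components, rng):
    """Sample truncated stick-breaking weights for a Dirichlet process.

    Each weight is a Beta(1, concentration) fraction of the stick that
    remains after all earlier components have taken their share.
    """
    fractions = rng.beta(1.0, concentration, size=num_components)
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - fractions)[:-1]))
    return fractions * remaining


rng = np.random.default_rng(0)

# Hypothetical settings: a small concentration puts most mass on a few
# components, a larger one spreads it over many.
weights_few = stick_breaking_weights(concentration=1.0, num_components=50, rng=rng)
weights_many = stick_breaking_weights(concentration=5.0, num_components=50, rng=rng)

print("largest weights, concentration 1:", np.sort(weights_few)[::-1][:5])
print("largest weights, concentration 5:", np.sort(weights_many)[::-1][:5])
```

The truncated weights sum to slightly less than one, and the leftover mass belongs to components beyond the truncation. The number of components with non-negligible weight is not fixed in advance, which is the sense in which a mixture's effective size can vary with the data.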


Parsimonious modeling with Information Filtering Networks

arXiv.org Machine Learning

We introduce a methodology to construct parsimonious probabilistic models. This method makes use of Information Filtering Networks to produce a robust estimate of the global sparse inverse covariance from a simple sum of local inverse covariances computed on small sub-parts of the network. Being based on local and low-dimensional inversions, this method is computationally very efficient and statistically robust, even for the estimation of the inverse covariance of high-dimensional, noisy and short time-series. Applied to financial data, our method is computationally more efficient than state-of-the-art methodologies such as Glasso, producing, in a fraction of the computation time, models that can have equivalent or better performance but with a sparser inference structure. We also discuss performance with sparse factor models, where we notice that relative performance decreases with the number of factors. The local nature of this approach allows us to perform computations in parallel and provides a tool for dynamical adaptation by partial updating when the properties of some variables change, without the need to recompute the whole model. This makes the approach particularly suitable for handling big datasets with large numbers of variables. Examples of practical application for forecasting, stress testing and risk allocation in financial systems are also provided.


Poisson Random Fields for Dynamic Feature Models

arXiv.org Machine Learning

We present the Wright-Fisher Indian buffet process (WF-IBP), a probabilistic model for time-dependent data assumed to have been generated by an unknown number of latent features. This model is suitable as a prior in Bayesian nonparametric feature allocation models in which the features underlying the observed data exhibit a dependency structure over time. More specifically, we establish a new framework for generating dependent Indian buffet processes, where the Poisson random field model from population genetics is used as a way of constructing dependent beta processes. Inference in the model is complex, and we describe a sophisticated Markov Chain Monte Carlo algorithm for exact posterior simulation. We apply our construction to develop a nonparametric focused topic model for collections of time-stamped text documents and test it on the full corpus of NIPS papers published from 1987 to 2015.