Goto

Collaborating Authors

 Bayesian Learning


A unifying framework for relaxations of the causal assumptions in Bell's theorem

arXiv.org Machine Learning

Bell's Theorem shows that quantum mechanical correlations can violate the constraints that the causal structure of certain experiments impose on any classical explanation. It is thus natural to ask to which degree the causal assumptions -- e.g. locality or measurement independence -- have to be relaxed in order to allow for a classical description of such experiments. Here, we develop a conceptual and computational framework for treating this problem. We employ the language of Bayesian networks to systematically construct alternative causal structures and bound the degree of relaxation using quantitative measures that originate from the mathematical theory of causality. The main technical insight is that the resulting problems can often be expressed as computationally tractable linear programs. We demonstrate the versatility of the framework by applying it to a variety of scenarios, ranging from relaxations of the measurement independence, locality and bilocality assumptions, to a novel causal interpretation of CHSH inequality violations.


rFerns: An Implementation of the Random Ferns Method for General-Purpose Machine Learning

arXiv.org Artificial Intelligence

Random ferns is a machine learning algorithm proposed by [11] for matching same elements between two images of the same scene, allowing one to recognise certain objects or trace them on videos. The original motivation behind this method was to create a simple and efficient algorithm by extending the Naรฏve Bayes classifier; still the authors acknowledged its strong connection to the decision tree ensembles like the Random forest [2] algorithm. Since introduction, Random ferns have been applied in numerous computer vision application, like image recognition [1], action recognition [10] or augmented reality [14]. However, it has not gathered attention outside this field; thus, this work aims to bring this algorithm to a much wider spectrum of applications. In order to do that, I propose a generalised version of the algorithm, implemented as an R [13] package rFerns. The paper is organised as follows. Section 2 briefly recalls the Bayesian derivation of the original version of Random ferns, presents the decision tree ensemble interpretation of the algorithm and lists modifications leading to the rFerns variant.


A framework for studying synaptic plasticity with neural spike train data

arXiv.org Machine Learning

Synaptic plasticity is believed to be the fundamental building block of learning and memory in the brain. Its study is of crucial importance to understanding the activity and function of neural circuits. With innovations in neural recording technology providing access to the simultaneous activity of increasingly large populations of neurons, statistical models are promising tools for formulating and testing hypotheses about the dynamics of synaptic connectivity. Advances in optical techniques (Packer et al., 2012; Hochbaum et al., 2014), for example, have made it possible to simultaneously record from and stimulate large populations of synaptically connected neurons. Armed with statistical tools capable of inferring time-varying synaptic connectivity, neuroscientists could test competing models of synaptic plasticity, discover new learning rules at the monosynaptic and network level, investigate the effects of disease on synaptic plasticity, and potentially design stimuli to modify neural networks. Despite the popularity of GLMs for spike data, relatively little work has attempted to model the time-varying nature of neural interactions. Here we model interaction weights as a dynamical system governed by parametric synaptic plasticity rules. To perform inference in this model, we use particle Markov Chain Monte Carlo (pMCMC) (Andrieu et al., 2010), a recently developed inference technique for complex time series. We use this new modeling framework to examine the problem of using recorded data to distinguish between proposed variants of spike-timing-dependent plasticity (STDP) learning rules.


Error Rate Bounds and Iterative Weighted Majority Voting for Crowdsourcing

arXiv.org Machine Learning

Crowdsourcing has become an effective and popular tool for human-powered computation to label large datasets. Since the workers can be unreliable, it is common in crowdsourcing to assign multiple workers to one task, and to aggregate the labels in order to obtain results of high quality. In this paper, we provide finite-sample exponential bounds on the error rate (in probability and in expectation) of general aggregation rules under the Dawid-Skene crowdsourcing model. The bounds are derived for multi-class labeling, and can be used to analyze many aggregation methods, including majority voting, weighted majority voting and the oracle Maximum A Posteriori (MAP) rule. We show that the oracle MAP rule approximately optimizes our upper bound on the mean error rate of weighted majority voting in certain setting. We propose an iterative weighted majority voting (IWMV) method that optimizes the error rate bound and approximates the oracle MAP rule. Its one step version has a provable theoretical guarantee on the error rate. The IWMV method is intuitive and computationally simple. Experimental results on simulated and real data show that IWMV performs at least on par with the state-of-the-art methods, and it has a much lower computational cost (around one hundred times faster) than the state-of-the-art methods.


A unified view of generative models for networks: models, methods, opportunities, and challenges

arXiv.org Machine Learning

These efforts have produced a diverse ecology of models and methods. Despite this diversity, many of these models share a common underlying structure: pairwise interactions (edges) are generated with probability conditional on latent vertex attributes. Differences between models generally stem from different philosophical choices about how to learn from data or different empirically-motivated goals. The highly interdisciplinary nature of work on these generative models, however, has inhibited the development of a unified view of their similarities and differences. For instance, novel theoretical models and optimization techniques developed in machine learning are largely unknown within the social and biological sciences, which have instead emphasized model interpretability. Here, we describe a unified view of generative models for networks that draws together many of these disparate threads and highlights the fundamental similarities and differences that span these fields. We then describe a number of opportunities and challenges for future work that are revealed by this view.


Dynamic Programming for Instance Annotation in Multi-instance Multi-label Learning

arXiv.org Machine Learning

Labeling data for classification requires significant human effort. To reduce labeling cost, instead of labeling every instance, a group of instances (bag) is labeled by a single bag label. Computer algorithms are then used to infer the label for each instance in a bag, a process referred to as instance annotation. This task is challenging due to the ambiguity regarding the instance labels. We propose a discriminative probabilistic model for the instance annotation problem and introduce an expectation maximization framework for inference, based on the maximum likelihood approach. For many probabilistic approaches, brute-force computation of the instance label posterior probability given its bag label is exponential in the number of instances in the bag. Our key contribution is a dynamic programming method for computing the posterior that is linear in the number of instances. We evaluate our methods using both benchmark and real world data sets, in the domain of bird song, image annotation, and activity recognition. In many cases, the proposed framework outperforms, sometimes significantly, the current state-of-the-art MIML learning methods, both in instance label prediction and bag label prediction.


Spatiotemporal Sparse Bayesian Learning with Applications to Compressed Sensing of Multichannel Physiological Signals

arXiv.org Machine Learning

Energy consumption is an important issue in continuous wireless telemonitoring of physiological signals. Compressed sensing (CS) is a promising framework to address it, due to its energy-efficient data compression procedure. However, most CS algorithms have difficulty in data recovery due to non-sparsity characteristic of many physiological signals. Block sparse Bayesian learning (BSBL) is an effective approach to recover such signals with satisfactory recovery quality. However, it is time-consuming in recovering multichannel signals, since its computational load almost linearly increases with the number of channels. This work proposes a spatiotemporal sparse Bayesian learning algorithm to recover multichannel signals simultaneously. It not only exploits temporal correlation within each channel signal, but also exploits inter-channel correlation among different channel signals. Furthermore, its computational load is not significantly affected by the number of channels. The proposed algorithm was applied to brain computer interface (BCI) and EEG-based driver's drowsiness estimation. Results showed that the algorithm had both better recovery performance and much higher speed than BSBL. Particularly, the proposed algorithm ensured that the BCI classification and the drowsiness estimation had little degradation even when data were compressed by 80%, making it very suitable for continuous wireless telemonitoring of multichannel signals.


Detecting change points in the large-scale structure of evolving networks

arXiv.org Machine Learning

Interactions among people or objects are often dynamic in nature and can be represented as a sequence of networks, each providing a snapshot of the interactions over a brief period of time. An important task in analyzing such evolving networks is change-point detection, in which we both identify the times at which the large-scale pattern of interactions changes fundamentally and quantify how large and what kind of change occurred. Here, we formalize for the first time the network change-point detection problem within an online probabilistic learning framework and introduce a method that can reliably solve it. This method combines a generalized hierarchical random graph model with a Bayesian hypothesis test to quantitatively determine if, when, and precisely how a change point has occurred. We analyze the detectability of our method using synthetic data with known change points of different types and magnitudes, and show that this method is more accurate than several previously used alternatives. Applied to two high-resolution evolving social networks, this method identifies a sequence of change points that align with known external "shocks" to these networks.


Joint modeling of multiple time series via the beta process with application to motion capture segmentation

arXiv.org Machine Learning

We propose a Bayesian nonparametric approach to the problem of jointly modeling multiple related time series. Our model discovers a latent set of dynamical behaviors shared among the sequences, and segments each time series into regions defined by a subset of these behaviors. Using a beta process prior, the size of the behavior set and the sharing pattern are both inferred from data. We develop Markov chain Monte Carlo (MCMC) methods based on the Indian buffet process representation of the predictive distribution of the beta process. Our MCMC inference algorithm efficiently adds and removes behaviors via novel split-merge moves as well as data-driven birth and death proposals, avoiding the need to consider a truncated model. We demonstrate promising results on unsupervised segmentation of human motion capture data.


Exact Estimation of Multiple Directed Acyclic Graphs

arXiv.org Machine Learning

This paper considers the problem of estimating the structure of multiple related directed acyclic graph (DAG) models. Building on recent developments in exact estimation of DAGs using integer linear programming (ILP), we present an ILP approach for joint estimation over multiple DAGs, that does not require that the vertices in each DAG share a common ordering. Furthermore, we allow also for (potentially unknown) dependency structure between the DAGs. Results are presented on both simulated data and fMRI data obtained from multiple subjects.