Uncertainty
Bethe Learning of Conditional Random Fields via MAP Decoding
Tang, Kui, Ruozzi, Nicholas, Belanger, David, Jebara, Tony
Many machine learning tasks can be formulated in terms of predicting structured outputs. In frameworks such as the structured support vector machine (SVM-Struct) and the structured per-ceptron, discriminative functions are learned by iteratively applying efficient maximum a posteri-ori (MAP) decoding. However, maximum likelihood estimation (MLE) of probabilistic models over these same structured spaces requires computing partition functions, which is generally intractable. This paper presents a method for learning discrete exponential family models using the Bethe approximation to the MLE. Remarkably, this problem also reduces to iterative (MAP) decoding. This connection emerges by combining the Bethe approximation with a Frank-Wolfe (FW) algorithm on a convex dual objective which circumvents the intractable partition function. The result is a new single loop algorithm MLE-Struct, which is substantially more efficient than previous double-loop methods for approximate maximum likelihood estimation. Our algorithm outperforms existing methods in experiments involving image segmentation, matching problems from vision, and a new dataset of university roommate assignments.
Heteroscedastic Treed Bayesian Optimisation
Assael, John-Alexander M., Wang, Ziyu, Shahriari, Bobak, de Freitas, Nando
Optimising black-box functions is important in many disciplines, such as tuning machine learning models, robotics, finance and mining exploration. Bayesian optimisation is a state-of-the-art technique for the global optimisation of black-box functions which are expensive to evaluate. At the core of this approach is a Gaussian process prior that captures our belief about the distribution over functions. However, in many cases a single Gaussian process is not flexible enough to capture non-stationarity in the objective function. Consequently, heteroscedasticity negatively affects performance of traditional Bayesian methods. In this paper, we propose a novel prior model with hierarchical parameter learning that tackles the problem of non-stationarity in Bayesian optimisation. Our results demonstrate substantial improvements in a wide range of applications, including automatic machine learning and mining exploration.
Connectedness of graphs and its application to connected matroids through covering-based rough sets
Graph theoretical ideas are highly utilized by computer science fields especially data mining. In this field, a data structure can be designed in the form of tree. Covering is a widely used form of data representation in data mining and covering-based rough sets provide a systematic approach to this type of representation. In this paper, we study the connectedness of graphs through covering-based rough sets and apply it to connected matroids. First, we present an approach to inducing a covering by a graph, and then study the connectedness of the graph from the viewpoint of the covering approximation operators. Second, we construct a graph from a matroid, and find the matroid and the graph have the same connectedness, which makes us to use covering-based rough sets to study connected matroids. In summary, this paper provides a new approach to studying graph theory and matroid theory.
Dependence space of matroids and its application to attribute reduction
Attribute reduction is a basic issue in knowledge representation and data mining. Rough sets provide a theoretical foundation for the issue. Matroids generalized from matrices have been widely used in many fields, particularly greedy algorithm design, which plays an important role in attribute reduction. Therefore, it is meaningful to combine matroids with rough sets to solve the optimization problems. In this paper, we introduce an existing algebraic structure called dependence space to study the reduction problem in terms of matroids. First, a dependence space of matroids is constructed. Second, the characterizations for the space such as consistent sets and reducts are studied through matroids. Finally, we investigate matroids by the means of the space and present two expressions for their bases. In a word, this paper provides new approaches to study attribute reduction.
Optimality of Poisson processes intensity learning with Gaussian processes
Kirichenko, Alisa, van Zanten, Harry
In this paper we provide theoretical support for the so-called "Sigmoidal Gaussian Cox Process" approach to learning the intensity of an inhomogeneous Poisson process on a $d$-dimensional domain. This method was proposed by Adams, Murray and MacKay (ICML, 2009), who developed a tractable computational approach and showed in simulation and real data experiments that it can work quite satisfactorily. The results presented in the present paper provide theoretical underpinning of the method. In particular, we show how to tune the priors on the hyper parameters of the model in order for the procedure to automatically adapt to the degree of smoothness of the unknown intensity and to achieve optimal convergence rates.
Modeling Ecological Integrity with Bayesian Belief Networks
Barrios, Juan M. (National Commission for Knowledge and Use of Biodiversity) | Sierra-Alcocer, Raรบl (National Commission for Knowledge and Use of Biodiversity) | Gonzรกlez-Salazar, Constantino (National Commission for Knowledge and Use of Biodiversity) | Mora, Franz E. (National Commission for Knowledge and Use of Biodiversity) | Munguรญa, Mariana (National Commission for Knowledge and Use of Biodiversity) | Pรฉrez-Maqueo, Octavio M. (National Commission for Knowledge and Use of Biodiversity) | Trejo, Isabel (National Commission for Knowledge and Use of Biodiversity)
Although the concept of ecological integrity is referred in many country legislations there is no consensus on how to formalize and implement it. One possible definition is as the capacity of an ecosystem to support and maintain a balanced, integrated, and adaptive community of organisms having a species composition, diversity, and functional organization comparable to that of a natural habitat of the region. Our objective is to model this interpretation of ecological integrity from a set of ecological measures that can be estimated from ecological inventory data.
Nonparametric Bayesian Learning of Other Agents' Policies in Interactive POMDPs
Panella, Alessandro (University of Illinois at Chicago) | Gmytrasiewicz, Piotr (University of Illinois at Chicago)
We consider an autonomous agent facing a partially observable, stochastic, multiagent environment where the unknown policies of other agents are represented as finite state controllers (FSCs). We show how an agent can (i) learn the FSCs of the other agents, and (ii) exploit these models during interactions. To separate the issues of off-line versus on-line learning we consider here an off-line two-phase approach. During the first phase the agent observes as the other player(s) are interacting with the environment (the observations may be imperfect and the learning agent is not taking part in the interaction.) The collected data is used to learn an ensemble of FSCs that explain the behavior of the other agent(s) using a Bayesian non-parametric (BNP) approach. We verify the quality of the learned models during the second phase by allowing the agent to compute its own optimal policy and interact with the observed agent. The optimal policy for the learning agent is obtained by solving an interactive POMDP in which the states are augmented by the other agent(s)' possible FSCs. The advantage of using the Bayesian nonparametric approach in the first phase is that the complexity (number of nodes) of the learned controllers is not bounded a priori. Our two-phase approach is preliminary and separates the learning using BNP from the complexities of learning on-line while the other agent may be modifying its policy (on-line approach is subject of our future work.) We describe our implementation and results in a multiagent Tiger domain. Our results show that learning improves the agent's performance, which increases with the amount of data collected during the learning phase.
Supervised learning sets benchmark for robust spike detection from calcium imaging signals
Theis, Lucas, Berens, Philipp, Froudarakis, Emmanouil, Reimer, Jacob, Rosรณn, Miroslav Romรกn, Baden, Tom, Euler, Thomas, Tolias, Andreas, Bethge, Matthias
A fundamental challenge in calcium imaging has been to infer the timing of action potentials from the measured noisy calcium fluorescence traces. We systematically evaluate a range of spike inference algorithms on a large benchmark dataset recorded from varying neural tissue (V1 and retina) using different calcium indicators (OGB-1 and GCamp6). We show that a new algorithm based on supervised learning in flexible probabilistic models outperforms all previously published techniques, setting a new standard for spike inference from calcium signals. Importantly, it performs better than other algorithms even on datasets not seen during training. Future data acquired in new experimental conditions can easily be used to further improve its spike prediction accuracy and generalization performance. Finally, we show that comparing algorithms on artificial data is not informative about performance on real population imaging data, suggesting that a benchmark dataset may greatly facilitate future algorithmic developments.
Generative Modeling of Hidden Functional Brain Networks
Nandy, Shaurabh, Golden, Richard M.
Resting-state functional connectivity fMRI data is a derivative of the unobservable neuronal functional network structure of the human brain. This data is subject to multiple sources of noise such as thermal noise, system noise, and physiological noise. Commonly used methods to infer the latent network structure, such as thresholding methods, make the implicit assumption that weak links are not as important as strong links, and that links are conditionally independent. However, such assumptions provide an incomplete description of the biology. Additionally, despite a core set of observations about functional networks such as smallworldness, modularity, exponentially truncated degree distributions, and presence of various types of hubs, very little is known about the computational principles which can give rise to these observations. This paper presents a Hidden Markov Random Field framework for the purpose of representing, estimating, and evaluating latent neuronal functional relationships using fMRI data. The main theoretical contributions of this paper are summarized as follows.
Covariance Matrices and Influence Scores for Mean Field Variational Bayes
Giordano, Ryan, Broderick, Tamara
Mean field variational Bayes (MFVB) is a popular posterior approximation method due to its fast runtime on large-scale data sets. However, it is well known that a major failing of MFVB is that it underestimates the uncertainty of model variables (sometimes severely) and provides no information about model variable covariance. We develop a fast, general methodology for exponential families that augments MFVB to deliver accurate uncertainty estimates for model variables -- both for individual variables and coherently across variables. MFVB for exponential families defines a fixed-point equation in the means of the approximating posterior, and our approach yields a covariance estimate by perturbing this fixed point. Inspired by linear response theory, we call our method linear response variational Bayes (LRVB). We also show how LRVB can be used to quickly calculate a measure of the influence of individual data points on parameter point estimates. We demonstrate the accuracy and scalability of our method by learning Gaussian mixture models for both simulated and real data.