Uncertainty
Latent Fisher Discriminant Analysis
Linear Discriminant Analysis (LDA) is a well-known method for dimensionality reduction and classification. Previous studies have also extended the binary-class case into multi-classes. However, many applications, such as object detection and keyframe extraction cannot provide consistent instance-label pairs, while LDA requires labels on instance level for training. Thus it cannot be directly applied for semi-supervised classification problem. In this paper, we overcome this limitation and propose a latent variable Fisher discriminant analysis model. We relax the instance-level labeling into bag-level, is a kind of semi-supervised (video-level labels of event type are required for semantic frame extraction) and incorporates a data-driven prior over the latent variables. Hence, our method combines the latent variable inference and dimension reduction in an unified bayesian framework. We test our method on MUSK and Corel data sets and yield competitive results compared to the baseline approach. We also demonstrate its capacity on the challenging TRECVID MED11 dataset for semantic keyframe extraction and conduct a human-factors ranking-based experimental evaluation, which clearly demonstrates our proposed method consistently extracts more semantically meaningful keyframes than challenging baselines.
A finite axiomatization of conditional independence and inclusion dependencies
Hannula, Miika, Kontinen, Juha
We formulate a finite axiomatization of the implication problem for inclusion and conditional independence atoms (dependencies) in the dependence logic context. The input of this problem is given by a finite set ฮฃ {ฯ} consisting of conditional independence atoms and inclusion atoms, and the question to decide is whether the following logical consequence holds ฮฃ ฯ. (1) Independence logic [12] and inclusion logic [6] are recent variants of dependence logic the semantics of which are defined over sets of assigments (teams) rather than a single assignment as in first-order logic.
Integrated Pre-Processing for Bayesian Nonlinear System Identification with Gaussian Processes
Frigola, Roger, Rasmussen, Carl Edward
We introduce GP-FNARX: a new model for nonlinear system identification based on a nonlinear autoregressive exogenous model (NARX) with filtered regressors (F) where the nonlinear regression problem is tackled using sparse Gaussian processes (GP). We integrate data pre-processing with system identification into a fully automated procedure that goes from raw data to an identified model. Both pre-processing parameters and GP hyper-parameters are tuned by maximizing the marginal likelihood of the probabilistic model. We obtain a Bayesian model of the system's dynamics which is able to report its uncertainty in regions where the data is scarce. The automated approach, the modeling of uncertainty and its relatively low computational cost make of GP-FNARX a good candidate for applications in robotics and adaptive control.
Using memristor crossbar structure to implement a novel adaptive real time fuzzy modeling algorithm
Afrakoti, Iman Esmaili Paeen, Shouraki, Saeed Bagheri, Merrikhbayat, Farnood
Although fuzzy techniques promise fast meanwhile accurate modeling and control abilities for complicated systems, different difficulties have been re-vealed in real situation implementations. Usually there is no escape of it-erative optimization based on crisp domain algorithms. Recently memristor structures appeared promising to implement neural network structures and fuzzy algorithms. In this paper a novel adaptive real-time fuzzy modeling algorithm is proposed which uses active learning method concept to mimic recent understandings of right brain processing techniques. The developed method is based on processing fuzzy numbers to provide the ability of being sensitive to each training data point to expand the knowledge tree leading to plasticity while used defuzzification technique guaranties enough stability. An outstanding characteristic of the proposed algorithm is its consistency to memristor crossbar hardware processing concepts. An analog implemen-tation of the proposed algorithm on memristor crossbars structure is also introduced in this paper. The effectiveness of the proposed algorithm in modeling and pattern recognition tasks is verified by means of computer simulations
Exponentially Fast Parameter Estimation in Networks Using Distributed Dual Averaging
Shahrampour, Shahin, Jadbabaie, Ali
In this paper we present an optimization-based view of distributed parameter estimation and observational social learning in networks. Agents receive a sequence of random, independent and identically distributed (i.i.d.) signals, each of which individually may not be informative about the underlying true state, but the signals together are globally informative enough to make the true state identifiable. Using an optimization-based characterization of Bayesian learning as proximal stochastic gradient descent (with Kullback-Leibler divergence from a prior as a proximal function), we show how to efficiently use a distributed, online variant of Nesterov's dual averaging method to solve the estimation with purely local information. When the true state is globally identifiable, and the network is connected, we prove that agents eventually learn the true parameter using a randomized gossip scheme. We demonstrate that with high probability the convergence is exponentially fast with a rate dependent on the KL divergence of observations under the true state from observations under the second likeliest state. Furthermore, our work also highlights the possibility of learning under continuous adaptation of network which is a consequence of employing constant, unit stepsize for the algorithm.
Efficient Monte Carlo Methods for Multi-Dimensional Learning with Classifier Chains
Read, Jesse, Martino, Luca, Luengo, David
Multidimensional classification (MDC) is the supervised learning problem where an instance is associated with multiple classes, rather than with a single class, as in traditional classification problems. Since these classes are often strongly correlated, modeling the dependencies between them allows MDC methods to improve their performance - at the expense of an increased computational cost. In this paper we focus on the classifier chains (CC) approach for modeling dependencies, one of the most popular and highestperforming methods for multi-label classification (MLC), a particular case of MDC which involves only binary classes (i.e., labels). The original CC algorithm makes a greedy approximation, and is fast but tends to propagate errors along the chain. Our algorithms remain tractable for high-dimensional data sets and obtain the best predictive performance across several real data sets. Keywords: classifier chains, multidimensional classification, multi-label classification, Monte Carlo methods, Bayesian inference 1. Introduction Multidimensional classification (MDC) is the supervised learning problem where an instance may be associated with multiple classes, rather than Preprint submitted to Pattern Recognition March 22, 2018 with a single class as in traditional binary or multi-class single-dimensional classification (SDC) problems. So-called MDC (e.g., in [1]) is also known in the literature as multi-target, multi-output [2], or multi-objective [3] classification The recently popularised task of multi-label classification (see [4, 5, 6, 7] for overviews) can be viewed as a particular case of the multidimensional problem that only involves binary classes, i.e., labels that can be turned on (1) or off (0) for any data instance. The MDC learning context is receiving increased attention in the literature, since it arises naturally in a wide variety of domains, such as image classification [8, 9], information retrieval and text categorization [10], automated detection of emotions in music [11] or bioinformatics [10, 12].
Variational Bayes Approximations for Clustering via Mixtures of Normal Inverse Gaussian Distributions
Subedi, Sanjeena, McNicholas, Paul D.
The use of mixture models for clustering, referred to as model-based clustering, has become increasingly popular since the work of Wolfe (1963). A wide variety of finite mixture models has been studied extensively within the literature to date. Amongst these, the Gaussian mixture model has received special attention due to its mathematical tractability and the relative computational simplicity associated with parameter estimation. However, the Gaussian mixture model is not without limitations; for instance, the component densities are restricted to being symmetric.
Compact Representations of Extended Causal Models
Halpern, Joseph Y., Hitchcock, Christopher
One of Judea Pearl's many, many important contributions to the study of causality was the first attempt to use the mathematical tools of causal modeling to give an account of "actual causation", a notion that has been of considerable interest among philosophers and legal theorists (Pearl, 2000, Chapter 10). Pearl later revised his account of actual causation in joint work with Halpern (Halpern & Pearl, 2005). A number of authors (Hall, 2007; Halpern, 2008; Hitchcock, 2007; Menzies, 2004) have suggested that an account of actual causation must be sensitive to considerations of normality, as well as to causal structure. In (Halpern & Hitchcock, 2011), we suggest a way of incorporating considerations of normality into the Halpern-Pearl theory, and show how to extend the account to illuminate features of the psychology of causal judgment, as well as features of causal reasoning in the law. Our account of actual causation makes use of "extended causal models", which include both structural equations among a set of variables, and a partial preorder on possible worlds, which represents the relative "normality" of those worlds. We actually want to think of people as working with the structural equations and normality order to evaluate actual causation. However, consideration of even simple examples immediately suggests a problem. A direct representation of the equations and normality order is too cumbersome for cognitively limited agents to use effectively. If our account of actual causation is to be at all realistic as a model of human causal judgment, some form of compact representation will be needed.
BayesOpt: A Library for Bayesian optimization with Robotics Applications
The purpose of this paper is twofold. On one side, we present a general framework for Bayesian optimization and we compare it with some related fields in active learning and Bayesian numerical analysis. On the other hand, Bayesian optimization and related problems (bandits, sequential experimental design) are highly dependent on the surrogate model that is selected. However, there is no clear standard in the literature. Thus, we present a fast and flexible toolbox that allows to test and combine different models and criteria with little effort. It includes most of the state-of-the-art contributions, algorithms and models. Its speed also removes part of the stigma that Bayesian optimization methods are only good for "expensive functions". The software is free and it can be used in many operating systems and computer languages.
Scalable Probabilistic Entity-Topic Modeling
Houlsby, Neil, Ciaramita, Massimiliano
We present an LDA approach to entity disambiguation. Each topic is associated with a Wikipedia article and topics generate either content words or entity mentions. Training such models is challenging because of the topic and vocabulary size, both in the millions. We tackle these problems using a novel distributed inference and representation framework based on a parallel Gibbs sampler guided by the Wikipedia link graph, and pipelines of MapReduce allowing fast and memory-frugal processing of large datasets. We report state-of-the-art performance on a public dataset.