A Maximum-Likelihood Approach to Modeling Multisensory Enhancement
Multisensory response enhancement (MRE) is the augmentation of the response of a neuron to sensory input of one modality by simultaneous input from another modality. The maximum likelihood (ML) model presented here modifies the Bayesian model for MRE (Anastasio et al.) by incorporating a decision strategy to maximize the number of correct decisions. Thus the ML model can also deal with the important tasks of stimulus discrimination and identification in the presence of incongruent visual and auditory cues. It accounts for the inverse effectiveness observed in neurophysiological recording data, and it predicts a functional relation between uni- and bimodal levels of discriminability that is testable both in neurophysiological and behavioral experiments.
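The decision strategy described above can be sketched in a few lines. This is a minimal illustration, not the paper's model: it assumes Gaussian likelihoods per modality and conditional independence of the visual and auditory observations given the stimulus; all parameter values are made up.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def ml_decision(v_obs, a_obs, hypotheses):
    """Pick the hypothesis maximizing the joint likelihood of the visual
    and auditory observations, assuming conditional independence across
    modalities (illustrative interface, not the paper's exact model)."""
    def likelihood(h):
        mu_v, mu_a, sigma = h
        return gaussian_pdf(v_obs, mu_v, sigma) * gaussian_pdf(a_obs, mu_a, sigma)
    return max(hypotheses, key=likelihood)

# Two hypothetical stimulus hypotheses: (visual mean, auditory mean, noise sd)
target = (1.0, 1.0, 1.0)
absent = (0.0, 0.0, 1.0)
print(ml_decision(0.8, 0.9, [target, absent]))  # congruent cues favor the target
```

With congruent cues near 1.0 the "target" hypothesis wins; incongruent or weak cues shift the decision, which is how such a rule can address discrimination under cue conflict.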
Probabilistic principles in unsupervised learning of visual structure: human data and a model
Edelman, Shimon, Hiles, Benjamin P., Yang, Hwajin, Intrator, Nathan
To find out how the representations of structured visual objects depend on the co-occurrence statistics of their constituents, we exposed subjects to a set of composite images with tight control exerted over (1) the conditional probabilities of the constituent fragments, and (2) the value of Barlow's criterion of "suspicious coincidence" (the ratio of joint probability to the product of marginals). We then compared the part verification response times for various probe/target combinations before and after the exposure. For composite probes, the speedup was much larger for targets that contained pairs of fragments perfectly predictive of each other, compared to those that did not. This effect was modulated by the significance of their co-occurrence as estimated by Barlow's criterion. For lone-fragment probes, the speedup in all conditions was generally lower than for composites. These results shed light on the brain's strategies for unsupervised acquisition of structural information in vision.
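Barlow's "suspicious coincidence" criterion used above is just the ratio of a joint probability to the product of its marginals. A minimal sketch (probability values are made up):

```python
def suspicious_coincidence(p_ab, p_a, p_b):
    """Barlow's criterion: ratio of the joint probability of two
    fragments to the product of their marginals.  Values well above 1
    flag a 'suspicious coincidence' worth encoding as a unit."""
    return p_ab / (p_a * p_b)

# Fragments that always co-occur (perfectly predictive of each other)
print(suspicious_coincidence(0.2, 0.2, 0.2))   # well above 1: suspicious
# Fragments that co-occur at chance level
print(suspicious_coincidence(0.04, 0.2, 0.2))  # about 1: independent
```

A ratio of 1 means the fragments co-occur no more often than chance, so there is no statistical reason to bind them into a composite representation.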
Modularity in the motor system: decomposition of muscle patterns as combinations of time-varying synergies
The question of whether the nervous system produces movement through the combination of a few discrete elements has long been central to the study of motor control. Muscle synergies, i.e. coordinated patterns of muscle activity, have been proposed as possible building blocks. Here we propose a model based on combinations of muscle synergies with a specific amplitude and temporal structure. Time-varying synergies provide a realistic basis for the decomposition of the complex patterns observed in natural behaviors. To extract time-varying synergies from simultaneous recording of EMG activity we developed an algorithm which extends existing nonnegative matrix factorization techniques.
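The generative side of the time-varying synergy model can be sketched as follows: each muscle pattern is a sum of amplitude-scaled, time-shifted synergies, m(t) = Σᵢ cᵢ · wᵢ(t − tᵢ). This is only the reconstruction step, not the extraction algorithm, and the synergy values are toy numbers:

```python
def reconstruct_emg(synergies, amplitudes, delays, T):
    """Reconstruct an EMG pattern (muscles x time) as a sum of
    amplitude-scaled, time-shifted synergies.  Each synergy is a list of
    per-muscle activation waveforms of equal duration."""
    n_muscles = len(synergies[0])
    m = [[0.0] * T for _ in range(n_muscles)]
    for w, c, t0 in zip(synergies, amplitudes, delays):
        dur = len(w[0])
        for mus in range(n_muscles):
            for t in range(dur):
                if 0 <= t0 + t < T:
                    m[mus][t0 + t] += c * w[mus][t]
    return m

# Two hypothetical synergies for 3 muscles, 4 samples each (toy numbers).
w1 = [[1.0] * 4 for _ in range(3)]
w2 = [[1.0 if t == mus else 0.0 for t in range(4)] for mus in range(3)]
emg = reconstruct_emg([w1, w2], amplitudes=[2.0, 1.0], delays=[0, 2], T=8)
print(emg[0])  # first muscle: [2.0, 2.0, 3.0, 2.0, 0.0, 0.0, 0.0, 0.0]
```

The extraction algorithm in the paper inverts this generative process, updating nonnegative synergies, amplitudes, and delays to best explain recorded EMG.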
A Rational Analysis of Cognitive Control in a Speeded Discrimination Task
Mozer, Michael C., Colagrosso, Michael D., Huber, David E.
We are interested in the mechanisms by which individuals monitor and adjust their performance of simple cognitive tasks. We model a speeded discrimination task in which individuals are asked to classify a sequence of stimuli (Jones & Braver, 2001). Response conflict arises when one stimulus class is infrequent relative to another, resulting in more errors and slower reaction times for the infrequent class. How do control processes modulate behavior based on the relative class frequencies? We explain performance from a rational perspective that casts the goal of individuals as minimizing a cost that depends both on error rate and reaction time.
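The speed-accuracy trade-off in this rational account can be illustrated with a toy drift-to-threshold process: raising the decision threshold lowers the error rate but lengthens the response time, and the rational controller picks the threshold minimizing the combined cost. The parameterization below is a standard diffusion-model error-rate formula with made-up cost weights, not the paper's model:

```python
import math

def expected_cost(threshold, drift=1.0, alpha=1.0, beta=0.1):
    """Cost = alpha * P(error) + beta * E[RT] for a simple symmetric
    drift-to-threshold process (illustrative parameterization):
    the error rate falls and the mean RT grows as the threshold rises."""
    p_error = 1.0 / (1.0 + math.exp(2 * drift * threshold))
    mean_rt = threshold / drift
    return alpha * p_error + beta * mean_rt

# A rational controller picks the threshold minimizing expected cost.
best = min((t / 10 for t in range(1, 51)), key=expected_cost)
print(round(best, 1))
```

Shifting the class frequencies changes the error term for each class, so the optimal thresholds shift too, reproducing the slower, more error-prone responses to the infrequent class.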
Novel iteration schemes for the Cluster Variation Method
Kappen, Hilbert J., Wiegerinck, Wim
It has been noted by several authors that Belief Propagation can also give impressive results for graphs that are not trees [2]. The Cluster Variation Method (CVM) is a method developed in the physics community for approximate inference in the Ising model [3]. The CVM approximates the joint probability distribution by a number of (overlapping) marginal distributions (clusters). The quality of the approximation is determined by the size and number of clusters. When the clusters consist of only two variables, the method is known as the Bethe approximation.
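The Bethe (pairwise-cluster) approximation mentioned above writes the joint as a product of pair marginals divided by single-variable marginals raised to (degree − 1). A minimal sketch, checked on a toy chain where the approximation is exact (variable names and the toy distribution are made up):

```python
import itertools

def bethe_joint(pairs, singles, degrees, x):
    """Bethe/pairwise-CVM approximation of a joint probability:
    product of pair (cluster) marginals divided by single-variable
    marginals raised to (degree - 1).  Exact when the graph is a tree,
    approximate on loopy graphs."""
    num = 1.0
    for (i, j), t in pairs.items():
        num *= t[(x[i], x[j])]
    den = 1.0
    for i, t in singles.items():
        den *= t[x[i]] ** (degrees[i] - 1)
    return num / den

# Toy chain A - B - C of binary variables (a tree, so Bethe is exact).
joint = {}
for a, b, c in itertools.product([0, 1], repeat=3):
    joint[(a, b, c)] = (0.6 if a == 0 else 0.4) * \
                       (0.7 if b == a else 0.3) * \
                       (0.8 if c == b else 0.2)

pairs = {("A", "B"): {}, ("B", "C"): {}}
singles = {"A": {}, "B": {}, "C": {}}
for (a, b, c), p in joint.items():
    pairs[("A", "B")][(a, b)] = pairs[("A", "B")].get((a, b), 0.0) + p
    pairs[("B", "C")][(b, c)] = pairs[("B", "C")].get((b, c), 0.0) + p
    singles["A"][a] = singles["A"].get(a, 0.0) + p
    singles["B"][b] = singles["B"].get(b, 0.0) + p
    singles["C"][c] = singles["C"].get(c, 0.0) + p

degrees = {"A": 1, "B": 2, "C": 1}
approx = bethe_joint(pairs, singles, degrees, {"A": 0, "B": 1, "C": 1})
print(abs(approx - joint[(0, 1, 1)]) < 1e-12)  # True: exact on a tree
```

On graphs with loops the same factorization is only approximate, which is where the iteration schemes studied in the paper come in.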
Model Based Population Tracking and Automatic Detection of Distribution Changes
Cadez, Igor V., Bradley, P. S.
Probabilistic mixture models are used for a broad range of data analysis tasks such as clustering, classification, predictive modeling, etc. Due to their inherent probabilistic nature, mixture models can easily be combined with other probabilistic or non-probabilistic techniques thus forming more complex data analysis systems. In the case of online data (where there is a stream of data available) models can be constantly updated to reflect the most current distribution of the incoming data. However, in many business applications the models themselves represent a parsimonious summary of the data and therefore it is not desirable to change models frequently, much less with every new data point. In such a framework it becomes crucial to track the applicability of the mixture model and detect the point in time when the model fails to adequately represent the data. In this paper we formulate the problem of change detection and propose a principled solution. Empirical results over both synthetic and real-life data sets are presented.
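One simple way to track a model's applicability is to monitor the windowed average log-likelihood of incoming data under the fixed mixture and flag a change when it drops below a baseline. The thresholding rule below is a hypothetical sketch, not the paper's statistic; the model and stream are toy values:

```python
import math

def gaussian_loglik(x, mu, sigma):
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)

def mixture_loglik(x, weights, mus, sigmas):
    """Log-likelihood of one point under a fixed Gaussian mixture."""
    return math.log(sum(w * math.exp(gaussian_loglik(x, m, s))
                        for w, m, s in zip(weights, mus, sigmas)))

def detect_change(stream, weights, mus, sigmas, window=50, drop=2.0):
    """Return the first index at which the windowed average
    log-likelihood falls `drop` nats below the initial window's average
    (a hypothetical rule, for illustration only)."""
    scores = [mixture_loglik(x, weights, mus, sigmas) for x in stream]
    baseline = sum(scores[:window]) / window
    for t in range(window, len(scores) - window + 1):
        avg = sum(scores[t:t + window]) / window
        if avg < baseline - drop:
            return t
    return None

# Stream drawn near the model's mode, then an abrupt distribution shift.
stream = [0.0] * 100 + [8.0] * 100
print(detect_change(stream, [1.0], [0.0], [1.0]))  # flags shortly after t=50
```

The point of such a monitor is exactly the scenario in the abstract: the model stays fixed as a parsimonious summary, and only a detected failure triggers re-estimation.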
Dynamic Time-Alignment Kernel in Support Vector Machine
Shimodaira, Hiroshi, Noma, Ken-ichi, Nakai, Mitsuru, Sagayama, Shigeki
A new class of Support Vector Machine (SVM) that is applicable to sequential-pattern recognition such as speech recognition is developed by incorporating the idea of nonlinear time alignment into the kernel function. Since the time-alignment operation of sequential patterns is embedded in the new kernel function, standard SVM training and classification algorithms can be employed without further modifications. The proposed SVM (DTAK-SVM) is evaluated in speaker-dependent recognition experiments on hand-segmented phonemes. Preliminary experimental results show recognition performance comparable with hidden Markov models (HMMs).
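The idea of embedding time alignment in a kernel can be sketched with plain dynamic time warping. Note this is a simplified stand-in: the paper's DTAK accumulates frame similarities along the optimal alignment path, and DTW-based kernels are not guaranteed positive semidefinite.

```python
import math

def dtw(a, b):
    """Dynamic time warping distance between two 1-D sequences
    (squared-difference local cost, symmetric step pattern)."""
    INF = float("inf")
    n, m = len(a), len(b)
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

def alignment_kernel(a, b, gamma=0.1):
    """A DTW-based similarity usable as an SVM kernel value
    (illustrative; not the paper's exact DTAK formulation)."""
    return math.exp(-gamma * dtw(a, b))

print(alignment_kernel([1, 2, 3], [1, 2, 3]))          # 1.0: identical sequences
# A time-stretched version of the same pattern still aligns perfectly,
# while a reversed pattern does not:
print(alignment_kernel([1, 2, 3], [1, 1, 2, 2, 3]) >
      alignment_kernel([1, 2, 3], [3, 2, 1]))          # True
```

Because the alignment lives inside the kernel, the rest of the SVM machinery (training, support vectors, classification) is untouched, which is the property the abstract highlights.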
Grammatical Bigrams
Unsupervised learning algorithms have been derived for several statistical models of English grammar, but their computational complexity makes applying them to large data sets intractable. This paper presents a probabilistic model of English grammar that is much simpler than conventional models, but which admits an efficient EM training algorithm. The model is based upon grammatical bigrams, i.e., syntactic relationships between pairs of words. We present the results of experiments that quantify the representational adequacy of the grammatical bigram model, its ability to generalize from labelled data, and its ability to induce syntactic structure from large amounts of raw text.
1 Introduction
One of the most significant challenges in learning grammars from raw text is keeping the computational complexity manageable. For example, the EM algorithm for the unsupervised training of Probabilistic Context-Free Grammars, known as the Inside-Outside algorithm, has been found in practice to be "computationally intractable for realistic problems" [1].
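The core modeling assumption, that a sentence's syntax decomposes into independent head-dependent word pairs, can be sketched as a scoring function. The interface and the probability table below are made up for illustration; the paper's model additionally handles dependency direction and proper normalization:

```python
def score_parse(deps, bigram_logprob):
    """Log-probability of a dependency structure under a grammatical-
    bigram model: each (head, dependent) word pair contributes an
    independent log-probability term (illustrative interface)."""
    return sum(bigram_logprob(h, d) for h, d in deps)

# Toy table of head -> dependent log-probabilities (made-up numbers).
TABLE = {("ROOT", "saw"): -0.5, ("saw", "John"): -1.0,
         ("saw", "dog"): -1.2, ("dog", "the"): -0.3}
lp = score_parse([("ROOT", "saw"), ("saw", "John"),
                  ("saw", "dog"), ("dog", "the")],
                 lambda h, d: TABLE[(h, d)])
print(lp)  # approximately -3.0
```

Because each pair is scored independently, the inside computations factor over word pairs rather than over full context-free derivations, which is what keeps EM training tractable at scale.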
Agglomerative Multivariate Information Bottleneck
Slonim, Noam, Friedman, Nir, Tishby, Naftali
The information bottleneck method is an unsupervised, model-independent data organization technique. Given a joint distribution p(A, B), this method constructs a new variable T that extracts partitions, or clusters, over the values of A that are informative about B. In a recent paper, we introduced a general principled framework for multivariate extensions of the information bottleneck method that allows us to consider multiple systems of data partitions that are interrelated. In this paper, we present a new family of simple agglomerative algorithms to construct such systems of interrelated clusters. We analyze the behavior of these algorithms and apply them to several real-life datasets.
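The basic step of an agglomerative information bottleneck algorithm is to merge the pair of clusters whose merger loses the least information about B; in the standard formulation that loss is the clusters' combined weight times the Jensen-Shannon divergence between their conditionals over B. A minimal sketch (weights and distributions are toy values):

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def merge_cost(w1, w2, p_b_t1, p_b_t2):
    """Information lost by merging clusters t1, t2:
    (w1 + w2) * JS_{pi1,pi2}(p(B|t1), p(B|t2)), pi_i = w_i / (w1 + w2)."""
    w = w1 + w2
    pi1, pi2 = w1 / w, w2 / w
    pbar = [pi1 * a + pi2 * b for a, b in zip(p_b_t1, p_b_t2)]
    return w * (pi1 * kl(p_b_t1, pbar) + pi2 * kl(p_b_t2, pbar))

# Identical conditionals lose no information; disjoint ones are costly.
print(merge_cost(0.3, 0.2, [0.5, 0.5], [0.5, 0.5]))         # 0.0
print(merge_cost(0.3, 0.2, [1.0, 0.0], [0.0, 1.0]) > 0.0)   # True
```

Greedily merging the cheapest pair at each step yields the hierarchy of clusters; the multivariate extension in the paper applies the analogous cost within interrelated systems of partitions.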
Bayesian Predictive Profiles With Applications to Retail Transaction Data
Cadez, Igor V., Smyth, Padhraic
Massive transaction data sets are recorded in a routine manner in telecommunications, retail commerce, and Web site management. In this paper we address the problem of inferring predictive individual profiles from such historical transaction data. We describe a generative mixture model for count data and use an approximate Bayesian estimation framework that effectively combines an individual's specific history with more general population patterns. We use a large real-world retail transaction data set to illustrate how these profiles consistently outperform non-mixture and non-Bayesian techniques in predicting customer behavior in out-of-sample data.
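The combination of individual history with population patterns can be sketched as follows: compute a posterior over population mixture components from one customer's counts, then form the predictive profile as the posterior-weighted mix of component rates. This is a minimal sketch with made-up prototypes, not the paper's full estimation scheme:

```python
import math

def predictive_profile(counts, thetas, priors):
    """Predictive item distribution for one customer: posterior-weighted
    mixture of multinomial components, where the posterior over
    components comes from the customer's own purchase counts."""
    # Log posterior over components given this individual's history.
    logpost = [math.log(w) + sum(n * math.log(t) for n, t in zip(counts, theta))
               for w, theta in zip(priors, thetas)]
    m = max(logpost)
    post = [math.exp(l - m) for l in logpost]
    z = sum(post)
    post = [p / z for p in post]
    # Predictive profile mixes component rates by posterior weight.
    return [sum(p * theta[i] for p, theta in zip(post, thetas))
            for i in range(len(counts))]

thetas = [[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]]   # two population prototypes
profile = predictive_profile([5, 0, 1], thetas, priors=[0.5, 0.5])
print(profile[0] > profile[2])  # item-0 history shifts predictive mass there
```

A customer with little history falls back toward the population-level mix, while a long history dominates the posterior, which is the Bayesian pooling behavior the abstract describes.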