Country
Efficient Estimation of OOMs
Jaeger, Herbert, Zhao, Mingjie, Kolling, Andreas
A standard method to obtain stochastic models for symbolic time series is to train state-emitting hidden Markov models (SE-HMMs) with the Baum-Welch algorithm. Based on observable operator models (OOMs), in the last few months a number of novel learning algorithms for similar purposes have been developed: (1,2) two versions of an "efficiency sharpening" (ES) algorithm, which iteratively improves the statistical efficiency of a sequence of OOM estimators, (3) a constrained gradient descent ML estimator for transition-emitting HMMs (TE-HMMs). We give an overview on these algorithms and compare them with SE-HMM/EM learning on synthetic and real-life data.
Fusion of Similarity Data in Clustering
Lange, Tilman, Buhmann, Joachim M.
Fusing multiple information sources can yield significant benefits to successfully accomplish learning tasks. Many studies have focussed on fusing information in supervised learning contexts. We present an approach to utilize multiple information sources in the form of similarity data for unsupervised learning. Based on similarity information, the clustering task is phrased as a nonnegative matrix factorization problem of a mixture of similarity measurements. The tradeoff between the informativeness of data sources and the sparseness of their mixture is controlled by an entropy-based weighting mechanism. For the purpose of model selection, a stability-based approach is employed to ensure the selection of the most self-consistent hypothesis. The experiments demonstrate the performance of the method on toy as well as real world data sets.
Gradient Flow Independent Component Analysis in Micropower VLSI
Celik, Abdullah, Stanacevic, Milutin, Cauwenberghs, Gert
Gradient flow representation of the traveling wave signals acquired over a miniature (1cm diameter) array of four microphones yields linearly mixed instantaneous observations of the time-differentiated sources, separated and localized by independent component analysis (ICA). The gradient flow and ICA processors each measure 3mm 3mm in 0.5 µm CMOS, and consume 54 µW and 180 µW power, respectively, from a 3 V supply at 16 ks/s sampling rate. Experiments demonstrate perceptually clear (12dB) separation and precise localization of two speech sources presented through speakers positioned at 1.5m from the array on a conference room table. Analysis of the multipath residuals shows that they are spectrally diffuse, and void of the direct path.
Bayesian model learning in human visual perception
Orbán, Gergő, Fiser, Jozsef, Aslin, Richard N., Lengyel, Máté
Humans make optimal perceptual decisions in noisy and ambiguous conditions. Computations underlying such optimal behavior have been shown to rely on probabilistic inference according to generative models whose structure is usually taken to be known a priori. We argue that Bayesian model selection is ideal for inferring similar and even more complex model structures from experience. We find in experiments that humans learn subtle statistical properties of visual scenes in a completely unsupervised manner. We show that these findings are well captured by Bayesian model learning within a class of models that seek to explain observed variables by independent hidden causes.
A Theoretical Analysis of Robust Coding over Noisy Overcomplete Channels
Doi, Eizaburo, Balcan, Doru C., Lewicki, Michael S.
Biological sensory systems are faced with the problem of encoding a high-fidelity sensory signal with a population of noisy, low-fidelity neurons. This problem can be expressed in information theoretic terms as coding and transmitting a multidimensional, analog signal over a set of noisy channels. Previously, we have shown that robust, overcomplete codes can be learned by minimizing the reconstruction error with a constraint on the channel capacity. Here, we present a theoretical analysis that characterizes the optimal linear coder and decoder for one-and twodimensional data. The analysis allows for an arbitrary number of coding units, thus including both under-and over-complete representations, and provides a number of important insights into optimal coding strategies. In particular, we show how the form of the code adapts to the number of coding units and to different data and noise conditions to achieve robustness. We also report numerical solutions for robust coding of highdimensional image data and show that these codes are substantially more robust compared against other image codes such as ICA and wavelets.
Benchmarking Non-Parametric Statistical Tests
Keller, Mikaela, Bengio, Samy, Wong, Siew Y.
Although nonparametric tests have already been proposed for that purpose, statistical significance tests for nonstandard measures (different from the classification error) are less often used in the literature. This paper is an attempt at empirically verifying how these tests compare with more classical tests, on various conditions. More precisely, using a very large dataset to estimate the whole "population", we analyzed the behavior of several statistical test, varying the class unbalance, the compared models, the performance measure, and the sample size. The main result is that providing big enough evaluation sets nonparametric tests are relatively reliable in all conditions.
Active Bidirectional Coupling in a Cochlear Chip
We present a novel cochlear model implemented in analog very large scale integration (VLSI) technology that emulates nonlinear active cochlear behavior. This silicon cochlea includes outer hair cell (OHC) electromotility through active bidirectional coupling (ABC), a mechanism we proposed in which OHC motile forces, through the microanatomical organization of the organ of Corti, realize the cochlear amplifier. Our chip measurements demonstrate that frequency responses become larger and more sharply tuned when ABC is turned on; the degree of the enhancement decreases with input intensity as ABC includes saturation of OHC forces.
Robust Fisher Discriminant Analysis
Kim, Seung-jean, Magnani, Alessandro, Boyd, Stephen
Fisher linear discriminant analysis (LDA) can be sensitive to the problem data. Robust Fisher LDA can systematically alleviate the sensitivity problem by explicitly incorporating a model of data uncertainty in a classification problem and optimizing for the worst-case scenario under this model. The main contribution of this paper is show that with general convex uncertainty models on the problem data, robust Fisher LDA can be carried out using convex optimization. For a certain type of product form uncertainty model, robust Fisher LDA can be carried out at a cost comparable to standard Fisher LDA. The method is demonstrated with some numerical examples. Finally, we show how to extend these results to robust kernel Fisher discriminant analysis, i.e., robust Fisher LDA in a high dimensional feature space.
Subsequence Kernels for Relation Extraction
Mooney, Raymond J., Bunescu, Razvan C.
We present a new kernel method for extracting semantic relations between entities in natural language text, based on a generalization of subsequence kernels. This kernel uses three types of subsequence patterns that are typically employed in natural language to assert relationships between two entities. Experiments on extracting protein interactions from biomedical corpora and top-level relations from newspaper corpora demonstrate the advantages of this approach.