Goto

Collaborating Authors

 Europe


Speech Recognition with Missing Data using Recurrent Neural Nets

Neural Information Processing Systems

In the'missing data' approach to improving the robustness of automatic speech recognition to added noise, an initial process identifies spectraltemporal regionswhich are dominated by the speech source. The remaining regions are considered to be'missing'. In this paper we develop a connectionist approach to the problem of adapting speech recognition to the missing data case, using Recurrent Neural Networks. In contrast to methods based on Hidden Markov Models, RNNs allow us to make use of long-term time constraints and to make the problems of classification with incomplete data and imputing missing values interact. We report encouraging results on an isolated digit recognition task.


Estimating the Reliability of ICA Projections

Neural Information Processing Systems

When applying unsupervised learning techniques like ICA or temporal decorrelation,a key question is whether the discovered projections arereliable. In other words: can we give error bars or can we assess the quality of our separation? We use resampling methods totackle these questions and show experimentally that our proposed variance estimations are strongly correlated to the separation error.We demonstrate that this reliability estimation can be used to choose the appropriate ICA-model, to enhance significantly theseparation performance, and, most important, to mark the components that have a actual physical meaning.


EM-DD: An Improved Multiple-Instance Learning Technique

Neural Information Processing Systems

We present a new multiple-instance (MI) learning technique (EM DD) that combines EM with the diverse density (DD) algorithm. EM-DD is a general-purpose MI algorithm that can be applied with boolean or real-value labels and makes real-value predictions. On the boolean Musk benchmarks, the EM-DD algorithm without any tuning significantly outperforms all previous algorithms. EM-DD is relatively insensitive to the number of relevant attributes in the data set and scales up well to large bag sizes. Furthermore, EM DD provides a new framework for MI learning, in which the MI problem is converted to a single-instance setting by using EM to estimate the instance responsible for the label of the bag. 1 Introduction The multiple-instance (MI) learning model has received much attention.


Products of Gaussians

Neural Information Processing Systems

Agakov System Engineering Research Group Chair of Manufacturing Technology Universitiit Erlangen-Niirnberg 91058 Erlangen, Germany F.Agakov@lft·uni-erlangen.de Stephen N. Felderhof Division of Informatics University of Edinburgh Edinburgh EH1 2QL, UK stephenf@dai.ed.ac.uk Abstract Recently Hinton (1999) has introduced the Products of Experts (PoE) model in which several individual probabilistic models for data are combined to provide an overall model of the data. Below weconsider PoE models in which each expert is a Gaussian. Although the product of Gaussians is also a Gaussian, if each Gaussian hasa simple structure the product can have a richer structure. We examine (1) Products of Gaussian pancakes which give rise to probabilistic Minor Components Analysis, (2) products of I-factor PPCA models and (3) a products of experts construction for an AR(l) process. Recently Hinton (1999) has introduced the Products of Experts (PoE) model in which several individual probabilistic models for data are combined to provide an overall model of the data.


Multi Dimensional ICA to Separate Correlated Sources

Neural Information Processing Systems

There are two linear transformations to be considered, one operating inside thechannels (0) and one operating between the different channels (W). The two transformations are estimated in two adjacent leA steps. There are mainly two advantages, that can be taken from the first transformation: (i) By arranging independence among the columns of the transformed patches, the average transinformation betweendifferent channels is decreased.


Bayesian time series classification

Neural Information Processing Systems

This paper proposes an approach to classification of adjacent segments of a time series as being either of classes. We use a hierarchical model that consists of a feature extraction stage and a generative classifier which is built on top of these features. Such two stage approaches are often used in signal and image processing. The novel part of our work is that we link these stages probabilistically by using a latent feature space. To use one joint model is a Bayesian requirement, which has the advantage to fuse information according to its certainty.


Infinite Mixtures of Gaussian Process Experts

Neural Information Processing Systems

We present an extension to the Mixture of Experts (ME) model, where the individual experts are Gaussian Process (GP) regression models. Using aninput-dependent adaptation of the Dirichlet Process, we implement agating network for an infinite number of Experts. Inference in this model may be done efficiently using a Markov Chain relying on Gibbs sampling. The model allows the effective covariance function to vary with the inputs, and may handle large datasets - thus potentially overcoming twoof the biggest hurdles with GP models.


Matching Free Trees with Replicator Equations

Neural Information Processing Systems

Motivated by our recent work on rooted tree matching, in this paper we provide a solution to the problem of matching two free (i.e., unrooted) trees by constructing an association graph whose maximal cliques are in one-to-one correspondence with maximal common subtrees. We then solve the problem using simple replicator dynamics from evolutionary game theory. Experiments on hundreds of uniformly random trees are presented. The results are impressive: despite the inherent inability of these simple dynamics to escape from local optima, they always returned a globally optimal solution.


A Dynamic HMM for On-line Segmentation of Sequential Data

Neural Information Processing Systems

We propose a novel method for the analysis of sequential data that exhibits an inherent mode switching. In particular, the data might be a non-stationary time series from a dynamical system that switches between multiple operating modes. Unlike other approaches, ourmethod processes the data incrementally and without any training of internal parameters. We use an HMM with a dynamically changingnumber of states and an online variant of the Viterbi algorithm that performs an unsupervised segmentation and classification of the data on-the-fly, i.e. the method is able to process incomingdata in real-time. The main idea of the approach is to track and segment changes of the probability density of the data in a sliding window on the incoming data stream.


The Method of Quantum Clustering

Neural Information Processing Systems

We propose a novel clustering method that is an extension of ideas inherent toscale-space clustering and support-vector clustering. Like the latter, itassociates every data point with a vector in Hilbert space, and like the former it puts emphasis on their total sum, that is equal to the scalespace probabilityfunction. The novelty of our approach is the study of an operator in Hilbert space, represented by the Schrödinger equation of which the probability function is a solution. This Schrödinger equation contains a potential function that can be derived analytically from the probability function.