Extended ICA Removes Artifacts from Electroencephalographic Recordings

Neural Information Processing Systems

Severe contamination of electroencephalographic (EEG) activity by eye movements, blinks, muscle, heart and line noise is a serious problem for EEG interpretation and analysis. Rejecting contaminated EEG segments results in a considerable loss of information and may be impractical for clinical data. Many methods have been proposed to remove eye movement and blink artifacts from EEG recordings. Often regression in the time or frequency domain is performed on simultaneous EEG and electrooculographic (EOG) recordings to derive parameters characterizing the appearance and spread of EOG artifacts in the EEG channels. However, EOG records also contain brain signals [1, 2], so regressing out EOG activity inevitably involves subtracting a portion of the relevant EEG signal from each recording as well. Regression cannot be used to remove muscle noise or line noise, since these have no reference channels. Here, we propose a new and generally applicable method for removing a wide variety of artifacts from EEG records. The method is based on an extended version of a previous Independent Component Analysis (ICA) algorithm [3, 4] for performing blind source separation on linear mixtures of independent source signals with either sub-Gaussian or super-Gaussian distributions. Our results show that ICA can effectively detect, separate and remove activity in EEG records from a wide variety of artifactual sources, with results comparing favorably to those obtained using regression-based methods.
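
The removal step can be illustrated with a minimal sketch, assuming an unmixing matrix has already been estimated. The function name `remove_components` and the synthetic signals are hypothetical, and the inverse of a known mixing matrix stands in for the output of the extended ICA algorithm, which is not implemented here.

```python
import numpy as np


def remove_components(eeg, unmixing, artifact_idx):
    """Zero selected components and project the rest back to the channels.

    eeg:          array of shape (channels, samples)
    unmixing:     square matrix of shape (channels, channels); assumed to
                  come from an ICA fit (hypothetical input, not computed here)
    artifact_idx: indices of the components judged to be artifacts
    """
    sources = unmixing @ eeg                 # component activations
    mixing = np.linalg.inv(unmixing)         # columns are scalp projections
    cleaned_sources = sources.copy()
    cleaned_sources[list(artifact_idx), :] = 0.0
    return mixing @ cleaned_sources          # artifact-free channel data


# Synthetic illustration: two "brain" sources and one blink-like source.
rng = np.random.default_rng(0)
n_samples = 1000
t = np.arange(n_samples)
brain = np.vstack([np.sin(2 * np.pi * t / 50), rng.laplace(size=n_samples)])
blink = 5.0 * (t % 250 < 10).astype(float)
true_sources = np.vstack([brain, blink])

A = rng.normal(size=(3, 3))                  # assumed linear mixing
eeg = A @ true_sources
W = np.linalg.inv(A)                         # stand-in for an ICA estimate

cleaned = remove_components(eeg, W, artifact_idx=[2])
expected = A[:, :2] @ brain                  # channels without the blink source
```

Because only the selected components are zeroed, the remaining activity in every channel is preserved, unlike rejecting whole contaminated segments.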


Phase Transitions and the Perceptual Organization of Video Sequences

Neural Information Processing Systems

Estimating motion in scenes containing multiple moving objects remains a difficult problem in computer vision. A promising approach to this problem involves using mixture models, where the motion of each object is a component in the mixture. However, existing methods typically require specifying in advance the number of components in the mixture, i.e. the number of objects in the scene.


Hybrid NN/HMM-Based Speech Recognition with a Discriminant Neural Feature Extraction

Neural Information Processing Systems

In this paper, we present a novel hybrid architecture for continuous speech recognition systems. It consists of a continuous HMM system extended by an arbitrary neural network that serves as a preprocessor: it takes several frames of the feature vector as input and produces feature vectors that are more discriminative with respect to the underlying HMM system. This hybrid system is an extension of a state-of-the-art continuous HMM system, and in fact, it is the first hybrid system that is really capable of outperforming these standard systems with respect to recognition accuracy. Experimental results show a relative error reduction of about 10%, achieved on a remarkably good recognition system based on continuous HMMs for the Resource Management 1000-word continuous speech recognition task.
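
As a rough sketch of what "several frames of the feature vector as input" can look like in practice, the snippet below stacks each frame with its neighbours into one longer vector that a neural preprocessor could consume. The function name `stack_frames`, the symmetric context window, and the edge padding are assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np


def stack_frames(features, context):
    """Concatenate each frame with `context` frames on either side.

    features: array of shape (frames, dims)
    context:  number of neighbouring frames taken on each side (assumed
              symmetric; utterance edges are padded by repeating the first
              and last frame, which is also an assumption)
    Returns an array of shape (frames, (2 * context + 1) * dims).
    """
    n_frames = features.shape[0]
    padded = np.pad(features, ((context, context), (0, 0)), mode="edge")
    windows = [padded[i:i + n_frames] for i in range(2 * context + 1)]
    return np.concatenate(windows, axis=1)


features = np.arange(15.0).reshape(5, 3)     # 5 frames, 3 dims each
stacked = stack_frames(features, context=1)  # 5 frames, 9 dims each
```

The stacked vectors would then be mapped by the network to the new features on which the HMM system operates.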


Active Data Clustering

Neural Information Processing Systems

Active data clustering is a novel technique for clustering of proximity data which utilizes principles from sequential experiment design in order to interleave data generation and data analysis. The proposed active data sampling strategy is based on the expected value of information, a concept rooted in statistical decision theory. This is considered to be an important step towards the analysis of large-scale datasets, because it offers a way to overcome the inherent data sparseness of proximity data.


Globally Optimal On-line Learning Rules

Neural Information Processing Systems

We present a method for determining the globally optimal on-line learning rule for a soft committee machine under a statistical mechanics framework. This work complements previous results on locally optimal rules, where only the rate of change in generalization error was considered. We maximize the total reduction in generalization error over the whole learning process and show how the resulting rule can significantly outperform the locally optimal rule.

1 Introduction

We consider a learning scenario in which a feed-forward neural network model (the student) emulates an unknown mapping (the teacher), given a set of training examples produced by the teacher. The performance of the student network is typically [...] A common form of training is on-line learning, where training patterns are presented sequentially and independently to the network at each learning step. This form of training can be beneficial in terms of both storage and computation time, especially for large systems.
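
The student-teacher scenario can be sketched as follows. This is plain on-line gradient descent with a fixed learning rate, used as a simple baseline; it is not the globally optimal rule of the paper. The error-function activation, the 1/N scaling of the step, and all sizes and constants are assumptions chosen for illustration.

```python
import math

import numpy as np

_erf = np.vectorize(math.erf)


def g(h):
    """Hidden-unit activation (assumed here to be erf(h / sqrt(2)))."""
    return _erf(h / math.sqrt(2.0))


def g_prime(h):
    """Derivative of g with respect to h."""
    return math.sqrt(2.0 / math.pi) * np.exp(-0.5 * h ** 2)


def committee_output(weights, x):
    """Soft committee machine: unweighted sum of hidden-unit activations."""
    return g(x @ weights.T).sum(axis=-1)


def generalization_error(student, teacher, test_inputs):
    """Monte Carlo estimate of half the mean squared output difference."""
    diff = (committee_output(student, test_inputs)
            - committee_output(teacher, test_inputs))
    return 0.5 * float(np.mean(diff ** 2))


def online_step(student, teacher, x, eta):
    """One gradient-descent update on a single, freshly drawn example."""
    n = x.shape[0]
    h = student @ x
    error = committee_output(teacher, x) - g(h).sum()
    delta = g_prime(h) * error
    return student + (eta / n) * np.outer(delta, x)


rng = np.random.default_rng(0)
n_inputs, n_hidden = 20, 2
# Teacher with orthonormal weight vectors (an illustrative choice).
teacher = np.linalg.qr(rng.normal(size=(n_inputs, n_hidden)))[0].T
student = rng.normal(scale=0.1 / math.sqrt(n_inputs),
                     size=(n_hidden, n_inputs))
test_inputs = rng.normal(size=(2000, n_inputs))

err_before = generalization_error(student, teacher, test_inputs)
for _ in range(200 * n_inputs):
    x = rng.normal(size=n_inputs)            # each pattern is used only once
    student = online_step(student, teacher, x, eta=0.5)
err_after = generalization_error(student, teacher, test_inputs)
```

Each pattern is discarded after its update, which is why on-line learning needs no storage of the training set.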



Adaptation in Speech Motor Control

Neural Information Processing Systems

Human subjects are known to adapt their motor behavior to a shift of the visual field brought about by wearing prism glasses over their eyes. We have studied the analog of this effect in speech. Using a device that can feed back transformed speech signals in real time, we exposed subjects to alterations of their own speech feedback. We found that speakers learn to adjust their production of a vowel to compensate for feedback alterations that change the vowel's perceived phonetic identity; moreover, the effect generalizes across consonant contexts and to different vowels.

1 INTRODUCTION

For more than a century, it has been known that humans will adapt their reaches to altered visual feedback [8]. One of the most studied examples of this adaptation is prism adaptation, which is seen when a subject reaches to targets while wearing image-shifting prism glasses [2]. Initially, the subject misses the targets, but he soon learns to compensate and reach accurately.


Approximating Posterior Distributions in Belief Networks Using Mixtures

Neural Information Processing Systems

Exact inference in densely connected Bayesian networks is computationally intractable, and so there is considerable interest in developing effective approximation schemes. One approach which has been adopted is to bound the log likelihood using a mean-field approximating distribution. While this leads to a tractable algorithm, the mean field distribution is assumed to be factorial and hence unimodal. In this paper we demonstrate the feasibility of using a richer class of approximating distributions based on mixtures of mean field distributions. We derive an efficient algorithm for updating the mixture parameters and apply it to the problem of learning in sigmoid belief networks. Our results demonstrate a systematic improvement over simple mean field theory as the number of mixture components is increased.
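
The bound described above can be checked numerically on a toy problem. The sketch below enumerates all states of three binary hidden variables, so it illustrates the bound only and is not the efficient algorithm of the paper. The joint table, the mean-field parameters, and the mixing weights are made-up values.

```python
import itertools

import numpy as np


def factorial_q(mu):
    """Probability of every joint binary state under a factorial Q."""
    mu = np.asarray(mu, dtype=float)
    states = np.array(list(itertools.product([0, 1], repeat=len(mu))))
    return np.prod(np.where(states == 1, mu, 1.0 - mu), axis=1)


def lower_bound(q, joint):
    """sum_H Q(H) ln[P(H, V) / Q(H)], a lower bound on ln P(V)."""
    return float(np.sum(q * (np.log(joint) - np.log(q))))


rng = np.random.default_rng(0)
joint = 0.01 + 0.1 * rng.random(8)           # toy table of P(H, V = v)
log_evidence = float(np.log(joint.sum()))    # exact ln P(V = v)

alphas = np.array([0.5, 0.5])                # mixing weights (illustrative)
components = [factorial_q([0.9, 0.2, 0.7]),
              factorial_q([0.1, 0.8, 0.3])]
q_mix = sum(a * q for a, q in zip(alphas, components))

bounds = [lower_bound(q, joint) for q in components]
bound_mix = lower_bound(q_mix, joint)

# Mutual information between the component label and the hidden state.
mutual_info = float(sum(a * np.sum(q * (np.log(q) - np.log(q_mix)))
                        for a, q in zip(alphas, components)))
```

Here `bound_mix` equals the weighted average of the component bounds plus `mutual_info`. Since mutual information is non-negative, the mixture bound is never worse than that average, while still lying below `log_evidence`.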


Reinforcement Learning with Hierarchies of Machines

Neural Information Processing Systems

We present a new approach to reinforcement learning in which the policies considered by the learning process are constrained by hierarchies of partially specified machines. This allows for the use of prior knowledge to reduce the search space and provides a framework in which knowledge can be transferred across problems and in which component solutions can be recombined to solve larger and more complicated problems. Our approach can be seen as providing a link between reinforcement learning and "behavior-based" or "teleo-reactive" approaches to control. We present provably convergent algorithms for problem-solving and learning with hierarchical machines and demonstrate their effectiveness on a problem with several thousand states.


A Solution for Missing Data in Recurrent Neural Networks with an Application to Blood Glucose Prediction

Neural Information Processing Systems

Volker Tresp and Thomas Briegel, Siemens AG, Corporate Technology, Otto-Hahn-Ring 6, 81730 München, Germany

Abstract

We consider neural network models for stochastic nonlinear dynamical systems where measurements of the variable of interest are only available at irregular intervals, i.e. most realizations are missing. Difficulties arise since the solutions for prediction and maximum likelihood learning with missing data lead to complex integrals, which even for simple cases cannot be solved analytically. In this paper we propose a specific combination of a nonlinear recurrent neural predictive model and a linear error model which leads to tractable prediction and maximum likelihood adaptation rules. In particular, the recurrent neural network can be trained using the real-time recurrent learning rule and the linear error model can be trained by an EM adaptation rule, implemented using forward-backward Kalman filter equations. The model is applied to predict the glucose/insulin metabolism of a diabetic patient where blood glucose measurements are only available a few times a day at irregular intervals.
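
How a Kalman filter copes with missing measurements can be sketched for a scalar linear model. This covers only the forward pass of a generic filter, under the assumption of a first-order linear error model with Gaussian noise. It is not the combined recurrent network and EM procedure of the paper, and the function name and all parameter values are hypothetical.

```python
import math


def kalman_filter_missing(ys, a, q, r, m0, p0):
    """Forward Kalman filter for x_t = a * x_{t-1} + w_t, y_t = x_t + v_t.

    ys: measurements, with float('nan') marking a missing value
    a:  state transition coefficient
    q:  process noise variance
    r:  measurement noise variance
    m0, p0: prior mean and variance of the state
    Returns lists of filtered means and variances.
    """
    means, variances = [], []
    m, p = m0, p0
    for y in ys:
        # Prediction step: always carried out.
        m = a * m
        p = a * a * p + q
        # Update step: skipped when the measurement is missing.
        if not math.isnan(y):
            gain = p / (p + r)
            m = m + gain * (y - m)
            p = (1.0 - gain) * p
        means.append(m)
        variances.append(p)
    return means, variances


nan = float("nan")
ys = [1.0, nan, nan, 1.0]                    # two missing measurements
means, variances = kalman_filter_missing(ys, a=1.0, q=0.1, r=0.1,
                                         m0=0.0, p0=1.0)
```

The variance grows over each gap and shrinks again when a measurement arrives, which matches the intuition that predictions become less certain between sparse blood glucose readings.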