A Solution for Missing Data in Recurrent Neural Networks with an Application to Blood Glucose Prediction

Neural Information Processing Systems

We consider neural network models for stochastic nonlinear dynamical systems where measurements of the variable of interest are only available at irregular intervals, i.e., most realizations are missing. Difficulties arise since the solutions for prediction and maximum likelihood learning with missing data lead to complex integrals, which even for simple cases cannot be solved analytically. In this paper we propose a specific combination of a nonlinear recurrent neural predictive model and a linear error model which leads to tractable prediction and maximum likelihood adaptation rules. In particular, the recurrent neural network can be trained using the real-time recurrent learning rule and the linear error model can be trained by an EM adaptation rule, implemented using forward-backward Kalman filter equations. The model is applied to predict the glucose/insulin metabolism of a diabetic patient where blood glucose measurements are only available a few times a day at irregular intervals.
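For a concrete picture of the missing-data machinery, the following is a minimal sketch of a forward Kalman filter that simply skips the update step whenever a measurement is absent. It assumes a scalar linear-Gaussian model and illustrative parameter values; it is not the paper's combination of an RTRL-trained recurrent network with an EM-adapted error model.

```python
import numpy as np

def kalman_filter_missing(y, A, C, Q, R, x0, P0):
    """Forward Kalman filter that skips updates when measurements are missing.

    y    : (T,) observations, with np.nan marking missing time steps
    A, C : scalar state-transition and observation coefficients
    Q, R : process / measurement noise variances
    """
    x, P = x0, P0
    states = []
    for t in range(len(y)):
        # Predict step: always performed, even when the measurement is missing.
        x = A * x
        P = A * P * A + Q
        if not np.isnan(y[t]):
            # Update step: only when a measurement is actually available.
            K = P * C / (C * P * C + R)          # Kalman gain
            x = x + K * (y[t] - C * x)
            P = (1.0 - K * C) * P
        states.append((x, P))
    return states

# Toy usage: a glucose-like series measured only a few times per "day".
y = np.array([5.2, np.nan, np.nan, 7.1, np.nan, 6.4])
print(kalman_filter_missing(y, A=0.95, C=1.0, Q=0.1, R=0.2, x0=5.0, P0=1.0))
```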


MELONET I: Neural Nets for Inventing Baroque-Style Chorale Variations

Neural Information Processing Systems

The investigation of neural information structures in music is a rather new, exciting research area bringing together different disciplines such as computer science, mathematics, musicology and cognitive science. One of its aims is to find out what determines the personal style of a composer. It has been shown that neural network models - better than other AI approaches - are able to learn and reproduce style-dependent features from given examples, e.g., chorale harmonizations in the style of Johann Sebastian Bach (Hild et al., 1992). However, when dealing with melodic sequences, e.g., folksong style melodies, all of these models have considerable difficulty learning even simple structures. The reason is that they are unable to capture high-order structure such as harmonies, motifs and phrases simultaneously occurring at multiple time scales.


Extended ICA Removes Artifacts from Electroencephalographic Recordings

Neural Information Processing Systems

Severe contamination of electroencephalographic (EEG) activity by eye movements, blinks, muscle, heart and line noise is a serious problem for EEG interpretation and analysis. Rejecting contaminated EEG segments results in a considerable loss of information and may be impractical for clinical data. Many methods have been proposed to remove eye movement and blink artifacts from EEG recordings. Often regression in the time or frequency domain is performed on simultaneous EEG and electrooculographic (EOG) recordings to derive parameters characterizing the appearance and spread of EOG artifacts in the EEG channels. However, EOG records also contain brain signals [1, 2], so regressing out EOG activity inevitably involves subtracting a portion of the relevant EEG signal from each recording as well. Regression cannot be used to remove muscle noise or line noise, since these have no reference channels. Here, we propose a new and generally applicable method for removing a wide variety of artifacts from EEG records. The method is based on an extended version of a previous Independent Component Analysis (ICA) algorithm [3, 4] for performing blind source separation on linear mixtures of independent source signals with either sub-Gaussian or super-Gaussian distributions. Our results show that ICA can effectively detect, separate and remove activity in EEG records from a wide variety of artifactual sources, with results comparing favorably to those obtained using regression-based methods.
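As a rough illustration of the decompose / zero-out / back-project workflow, the sketch below uses scikit-learn's FastICA as a stand-in ICA. The paper itself uses an extended infomax algorithm handling both sub- and super-Gaussian sources, and the component indices marked as artifactual here are purely hypothetical.

```python
import numpy as np
from sklearn.decomposition import FastICA

def remove_artifact_components(eeg, artifact_idx, n_components=None, seed=0):
    """eeg: (n_samples, n_channels) array; artifact_idx: component indices to zero out."""
    ica = FastICA(n_components=n_components, random_state=seed)
    sources = ica.fit_transform(eeg)          # estimated independent components
    sources[:, artifact_idx] = 0.0            # drop e.g. blink or line-noise components
    return ica.inverse_transform(sources)     # back-project to cleaned EEG channels

# Toy usage: 1000 samples, 8 channels, pretending component 0 is an eye-blink source.
rng = np.random.default_rng(0)
eeg = rng.standard_normal((1000, 8))
clean = remove_artifact_components(eeg, artifact_idx=[0])
```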


Multiresolution Tangent Distance for Affine-invariant Classification

Neural Information Processing Systems

The ability to rely on similarity metrics invariant to image transformations is an important issue for image classification tasks such as face or character recognition. We analyze an invariant metric that has performed well for the latter - the tangent distance - and study its limitations when applied to regular images, showing that the most significant among these (convergence to local minima) can be drastically reduced by computing the distance in a multiresolution setting. This leads to the multiresolution tangent distance, which exhibits significantly higher invariance to image transformations, and can be easily combined with robust estimation procedures.
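The following sketch shows a one-sided tangent distance computed by least-squares projection onto a translation-only tangent subspace. The tangent set and the toy image are illustrative assumptions, and the coarse-to-fine multiresolution machinery of the paper is not reproduced here.

```python
import numpy as np

def translation_tangents(img):
    """Finite-difference tangent vectors for small horizontal / vertical shifts."""
    tx = (np.roll(img, -1, axis=1) - np.roll(img, 1, axis=1)) / 2.0
    ty = (np.roll(img, -1, axis=0) - np.roll(img, 1, axis=0)) / 2.0
    return np.stack([tx.ravel(), ty.ravel()], axis=1)   # shape (d, 2)

def tangent_distance(x_img, y_img):
    """One-sided tangent distance: min_a ||x + T a - y||, solved by least squares."""
    T = translation_tangents(x_img)
    diff = (y_img - x_img).ravel()
    a, *_ = np.linalg.lstsq(T, diff, rcond=None)
    return np.linalg.norm(diff - T @ a)

# Toy usage: for a smooth image and a small shift, the tangent distance
# should be noticeably smaller than the plain Euclidean distance.
u = np.linspace(0, 3, 32)
img = np.outer(np.sin(u), np.cos(u))
shifted = np.roll(img, 1, axis=1)
print(np.linalg.norm(img - shifted), tangent_distance(img, shifted))
```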


Agnostic Classification of Markovian Sequences

Neural Information Processing Systems

Classification of finite sequences without explicit knowledge of their statistical nature is a fundamental problem with many important applications. We propose a new information theoretic approach to this problem which is based on the following ingredients: (i) sequences are similar when they are likely to be generated by the same source; (ii) cross entropies can be estimated via "universal compression"; (iii) Markovian sequences can be asymptotically optimally merged. With these ingredients we design a method for the classification of discrete sequences whenever they can be compressed. We introduce the method and illustrate its application for hierarchical clustering of languages and for estimating similarities of protein sequences.
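A small illustration of ingredient (ii): a compression-based dissimilarity between sequences, here the normalized compression distance built on zlib. This particular formula is only a proxy for the idea of estimating cross entropies via universal compression, not the estimator used in the paper.

```python
import zlib

def compressed_size(s: bytes) -> int:
    return len(zlib.compress(s, 9))

def compression_dissimilarity(x: bytes, y: bytes) -> float:
    """Normalized compression distance: sequences from similar sources
    compress better together than sequences from unrelated sources."""
    cx, cy, cxy = compressed_size(x), compressed_size(y), compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

# Toy usage: x1 and x2 come from a similar "source", x3 does not.
x1 = b"the cat sat on the mat " * 20
x2 = b"the cat sat on the hat " * 20
x3 = b"xyzzy qwerty asdfgh jkl " * 20
print(compression_dissimilarity(x1, x2), compression_dissimilarity(x1, x3))
```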


S-Map: A Network with a Simple Self-Organization Algorithm for Generative Topographic Mappings

Neural Information Processing Systems

The S-Map is a network with a simple learning algorithm that combines the self-organization capability of the Self-Organizing Map (SOM) and the probabilistic interpretability of the Generative Topographic Mapping (GTM). The simulations suggest that the S-Map algorithm has a stronger tendency to self-organize from a random initial configuration than the GTM. The S-Map algorithm can be further simplified to employ pure Hebbian learning, without changing the qualitative behaviour of the network.

1 Introduction

The self-organizing map (SOM; for a review, see [1]) forms a topographic mapping from the data space onto a (usually two-dimensional) output space. The SOM has been successfully used in a large number of applications [2]; nevertheless, there are some open theoretical questions, as discussed in [1, 3]. Most of these questions arise because of the following two facts: the SOM is not a generative model, i.e. it does not generate a density in the data space, and it does not have a well-defined objective function that the training process would strictly minimize.
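For orientation, the sketch below implements a plain SOM training loop (winner selection plus neighborhood-weighted updates). It is the standard algorithm the abstract refers back to, not the S-Map itself, and the grid size, learning-rate schedule and neighborhood width are illustrative choices.

```python
import numpy as np

def train_som(data, grid=(10, 10), epochs=20, lr=0.5, sigma=2.0, seed=0):
    """Minimal standard SOM: winner-take-all matching plus a
    neighborhood-weighted, Hebbian-like pull of units toward each sample."""
    rng = np.random.default_rng(seed)
    h, w = grid
    units = rng.standard_normal((h * w, data.shape[1]))
    coords = np.array([(i, j) for i in range(h) for j in range(w)], dtype=float)
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            winner = np.argmin(np.sum((units - x) ** 2, axis=1))
            d2 = np.sum((coords - coords[winner]) ** 2, axis=1)
            nbh = np.exp(-d2 / (2 * sigma ** 2))          # neighborhood on the output grid
            units += lr * nbh[:, None] * (x - units)      # pull units toward the sample
        lr *= 0.95                                        # anneal learning rate
        sigma *= 0.95                                     # shrink neighborhood radius
    return units.reshape(h, w, -1)

# Toy usage: organize a 10x10 map over uniform 2-D data.
som = train_som(np.random.default_rng(1).random((500, 2)))
```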


Using Expectation to Guide Processing: A Study of Three Real-World Applications

Neural Information Processing Systems

In many real-world tasks, only a small fraction of the available inputs are important at any particular time. This paper presents a method for ascertaining the relevance of inputs by exploiting temporal coherence and predictability. The method proposed in this paper dynamically allocates relevance to inputs by using expectations of their future values. As a model of the task is learned, the model is simultaneously extended to create task-specific predictions of the future values of inputs. Inputs which are either not relevant, and therefore not accounted for in the model, or which contain noise, will not be predicted accurately. These inputs can be de-emphasized and, in turn, a new, improved model of the task created. The techniques presented in this paper have yielded significant improvements for the vision-based autonomous control of a land vehicle, vision-based hand tracking in cluttered scenes, and the detection of faults in the etching of semiconductor wafers.
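A schematic reading of the relevance idea: supply predictions of the next input vector and down-weight dimensions whose predictions turn out poor. The weighting formula and the toy data below are assumptions for illustration, not the paper's exact mechanism.

```python
import numpy as np

def relevance_from_predictability(inputs, predictions, eps=1e-6):
    """Weight each input dimension by how predictable it has been: dimensions whose
    predicted next values match the observations get weights near 1, while noisy or
    unmodelled dimensions are de-emphasized."""
    err = np.mean((inputs[1:] - predictions[:-1]) ** 2, axis=0)   # per-dimension error
    weights = 1.0 / (1.0 + err / (err.mean() + eps))
    return weights / weights.max()

# Toy usage: dimension 0 is predictable (a sine), dimension 1 is pure noise.
rng = np.random.default_rng(0)
t = np.linspace(0, 10, 200)
inputs = np.stack([np.sin(t), rng.standard_normal(200)], axis=1)
predictions = np.stack([np.sin(t + t[1] - t[0]), np.zeros(200)], axis=1)  # crude forecasts
print(relevance_from_predictability(inputs, predictions))
```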



Self-similarity Properties of Natural Images

Neural Information Processing Systems

Scale invariance is a fundamental property of ensembles of natural images [1]. Their non-Gaussian properties [15, 16] are less well understood, but they indicate the existence of a rich statistical structure. In this work we present a detailed study of the marginal statistics of a variable related to the edges in the images. A numerical analysis shows that it exhibits extended self-similarity [3, 4, 5]. This is a scaling property stronger than self-similarity: all its moments can be expressed as a power of any given moment. More interestingly, all the exponents can be predicted in terms of a multiplicative log-Poisson process. This is the same model that was recently used to predict the correct exponents of the structure functions of turbulent flows [6]. These results allow us to study the underlying multifractal singularities. In particular we find that the most singular structures are one-dimensional: the most singular manifold consists of sharp edges.
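As a worked illustration of extended self-similarity, the snippet below computes moments of absolute increments at several scales and regresses one log-moment against another. The increment variable and the Brownian-rows test image are stand-ins for the paper's edge-related variable and natural-image ensemble; the point is only that moments of different orders are related by a power law across scales.

```python
import numpy as np

def structure_moments(img, orders=(1, 2, 3, 4), scales=(1, 2, 4, 8)):
    """Moments of absolute horizontal increments |I(x+r) - I(x)| at several scales.
    Extended self-similarity predicts that log-moments of different orders are
    linearly related across scales."""
    rows = []
    for r in scales:
        inc = np.abs(img[:, r:] - img[:, :-r]).ravel()
        rows.append([np.mean(inc ** p) for p in orders])
    return np.array(rows)                      # shape (n_scales, n_orders)

# Toy check: rows built as Brownian paths, so increments scale as r**(1/2)
# and the relative exponent of order 3 vs order 2 should be close to 3/2.
rng = np.random.default_rng(0)
img = np.cumsum(rng.standard_normal((256, 256)), axis=1)
S = structure_moments(img)
slope = np.polyfit(np.log(S[:, 1]), np.log(S[:, 2]), 1)[0]
print("relative ESS exponent (order 3 vs order 2):", slope)
```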