
Multiplicative Updating Rule for Blind Separation Derived from the Method of Scoring

Neural Information Processing Systems

The idea is to calculate differentials by using a relative increment instead of an absolute increment in the parameter space. This idea has been extended to compute the relative Hessian by Pham (1996).
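A relative increment leads to multiplicative updates of the form W &lt;- (I + eta*G) W rather than an absolute step W &lt;- W + eta*dW. The following is a minimal sketch of one such relative-gradient step for blind separation; the choice of nonlinearity (tanh) and step size are illustrative assumptions, not the paper's exact scoring-based rule.

```python
import numpy as np

def relative_gradient_step(W, x, eta=0.01, phi=np.tanh):
    # Relative (multiplicative) update: the increment is expressed
    # relative to the current unmixing matrix W, i.e.
    # W <- (I + eta*G) W, instead of an absolute step in parameter space.
    y = W @ x                                  # current source estimates
    n = len(y)
    G = np.eye(n) - np.outer(phi(y), y)        # relative gradient direction
    return (np.eye(n) + eta * G) @ W
```

Because the increment multiplies W, the update is equivariant: its behavior does not depend on the conditioning of the mixing matrix.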


RCC Cannot Compute Certain FSA, Even with Arbitrary Transfer Functions

Neural Information Processing Systems

The proof given here shows that for any finite, discrete transfer function used by the units of an RCC network, there are finite-state automata (FSA) that the network cannot model, no matter how many units are used. The proof also applies to continuous transfer functions with a finite number of fixed-points, such as sigmoid and radial-basis functions.


Extended ICA Removes Artifacts from Electroencephalographic Recordings

Neural Information Processing Systems

Severe contamination of electroencephalographic (EEG) activity by eye movements, blinks, muscle, heart and line noise is a serious problem for EEG interpretation and analysis. Rejecting contaminated EEG segments results in a considerable loss of information and may be impractical for clinical data. Many methods have been proposed to remove eye movement and blink artifacts from EEG recordings. Often regression in the time or frequency domain is performed on simultaneous EEG and electrooculographic (EOG) recordings to derive parameters characterizing the appearance and spread of EOG artifacts in the EEG channels. However, EOG records also contain brain signals [1, 2], so regressing out EOG activity inevitably involves subtracting a portion of the relevant EEG signal from each recording as well. Regression cannot be used to remove muscle noise or line noise, since these have no reference channels. Here, we propose a new and generally applicable method for removing a wide variety of artifacts from EEG records. The method is based on an extended version of a previous Independent Component Analysis (ICA) algorithm [3, 4] for performing blind source separation on linear mixtures of independent source signals with either sub-Gaussian or super-Gaussian distributions. Our results show that ICA can effectively detect, separate and remove activity in EEG records from a wide variety of artifactual sources, with results comparing favorably to those obtained using regression-based methods.
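The extension to both sub- and super-Gaussian sources can be sketched as a batch update in which the sign of each component's excess kurtosis switches the learning rule between the two regimes. This is a simplified sketch of an extended-infomax-style step, not the authors' exact implementation; the step size and batch estimation of kurtosis are assumptions.

```python
import numpy as np

def extended_infomax_step(W, X, eta=0.001):
    # One batch update of an extended-infomax-style rule (sketch):
    # k_i = +1 selects the super-Gaussian form, k_i = -1 the
    # sub-Gaussian form, chosen by the sign of the excess kurtosis.
    Y = W @ X
    n = X.shape[1]
    kurt = np.mean(Y**4, axis=1) - 3.0 * np.mean(Y**2, axis=1) ** 2
    K = np.diag(np.sign(kurt))           # per-component regime switch
    I = np.eye(W.shape[0])
    G = I - (K @ np.tanh(Y) @ Y.T + Y @ Y.T) / n
    return W + eta * G @ W
```

With the switch, the same rule can unmix line noise (sub-Gaussian) and blink artifacts (super-Gaussian) from the same recording.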


Stacked Density Estimation

Neural Information Processing Systems

One frequently estimates density functions for which there is little prior knowledge on the shape of the density and for which one wants a flexible and robust estimator (allowing multimodality if it exists). In this context, the methods of choice tend to be finite mixture models and kernel density estimation methods. For mixture modeling, mixtures of Gaussian components are frequently assumed and model choice reduces to the problem of choosing the number k of Gaussian components in the model (Titterington, Smith and Makov, 1986). For kernel density estimation, kernel shapes are typically chosen from a selection of simple unimodal densities such as Gaussian, triangular, or Cauchy densities, and kernel bandwidths are selected in a data-driven manner (Silverman 1986; Scott 1994). As argued by Draper (1996), model uncertainty can contribute significantly to predictive error in estimation. While usually considered in the context of supervised learning, model uncertainty is also important in unsupervised learning applications such as density estimation. Even when the model class under consideration contains the true density, if we are only given a finite data set, then there is always a chance of selecting the wrong model. Moreover, even if the correct model is selected, there will typically be estimation error in the parameters of that model.
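Stacking addresses this model uncertainty by combining the candidate estimators rather than selecting one. The following sketch stacks a single parametric Gaussian with a kernel density estimate, choosing the combination weight to maximize held-out log-likelihood; the grid search, bandwidth, and two-component setup are simplifying assumptions (the paper's method uses cross-validated mixture weights over a larger model set).

```python
import numpy as np

def gaussian_logpdf(x, mu, var):
    # Log-density of a single fitted Gaussian at points x.
    return -0.5 * (np.log(2 * np.pi * var) + (x - mu) ** 2 / var)

def kde_logpdf(x, data, h):
    # Gaussian kernel density estimate (bandwidth h) evaluated at x.
    d = (x[:, None] - data[None, :]) / h
    return np.log(np.mean(np.exp(-0.5 * d**2) / np.sqrt(2 * np.pi), axis=1) / h)

def stack_weights(train, valid, h=0.3):
    # Fit each component estimator on `train`, then pick the convex
    # combination weight w that maximizes log-likelihood on `valid`.
    lg = gaussian_logpdf(valid, train.mean(), train.var())
    lk = kde_logpdf(valid, train, h)
    best_w, best_ll = 0.0, -np.inf
    for w in np.linspace(0.0, 1.0, 101):
        ll = np.sum(np.log(w * np.exp(lg) + (1 - w) * np.exp(lk) + 1e-300))
        if ll > best_ll:
            best_w, best_ll = w, ll
    return best_w
```

On clearly multimodal data the held-out criterion pushes the weight toward the flexible kernel estimator, illustrating how stacking hedges against choosing the wrong model class.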


Modeling Complex Cells in an Awake Macaque during Natural Image Viewing

Neural Information Processing Systems

Our model consists of a classical energy mechanism whose output is divided by nonclassical gain control and texture contrast mechanisms. We apply this model to review movies, a stimulus sequence that replicates the stimulation a cell receives during free viewing of natural images. Data were collected from three cells using five different review movies, and the model was fit separately to the data from each movie. For the energy mechanism alone we find modest but significant correlations (r_E = 0.41, 0.43, 0.59, 0.35) between model and data. These correlations are improved somewhat when we allow for suppressive surround effects (r_{E+G} = 0.42, 0.56, 0.60, 0.37). In one case the inclusion of a delayed suppressive surround dramatically improves the fit to the data by modifying the time course of the model's response.
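The model form described above can be sketched in a few lines: a classical energy mechanism (sum of squared quadrature-pair filter outputs) divided by a suppressive gain signal pooled from the surround. The filter kernels, the single linear pool, and the semisaturation constant sigma are illustrative assumptions, not the paper's fitted parameters.

```python
import numpy as np

def complex_cell_response(stim, f_even, f_odd, pool, sigma=1.0):
    # Classical energy mechanism: squared outputs of a quadrature pair.
    energy = (stim @ f_even) ** 2 + (stim @ f_odd) ** 2
    # Divisive gain control: a suppressive signal pooled from the
    # nonclassical surround (squared to keep the denominator positive).
    gain = sigma ** 2 + (stim @ pool) ** 2
    return energy / gain
```

The divisive denominator is what lets surround stimulation suppress the response without changing the cell's preferred stimulus.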


Learning Human-like Knowledge by Singular Value Decomposition: A Progress Report

Neural Information Processing Systems

Singular value decomposition (SVD) can be viewed as a method for unsupervised training of a network that associates two classes of events reciprocally by linear connections through a single hidden layer. SVD was used to learn and represent relations among very large numbers of words (20k-60k) and very large numbers of natural text passages (1k-70k) in which they occurred. The result was 100-350 dimensional "semantic spaces" in which any trained or novel word or passage could be represented as a vector, and similarities were measured by the cosine of the contained angle between vectors. Good accuracy in simulating human judgments and behaviors has been demonstrated by performance on multiple-choice vocabulary and domain knowledge tests, emulation of expert essay evaluations, and in several other ways. Examples are also given of how the kind of knowledge extracted by this method can be applied.
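The core computation can be sketched as follows: decompose the word-by-passage matrix, keep the top-k singular triplets, and compare the resulting vectors by cosine. The scaling of the vectors by the singular values is one common convention, assumed here for illustration.

```python
import numpy as np

def semantic_space(counts, k):
    # counts: word-by-passage co-occurrence matrix.
    # Truncate the SVD at rank k; scaling by the singular values gives
    # word and passage vectors in a shared k-dimensional semantic space.
    U, s, Vt = np.linalg.svd(counts, full_matrices=False)
    return U[:, :k] * s[:k], Vt[:k, :].T * s[:k]

def cosine(a, b):
    # Similarity as the cosine of the contained angle between vectors.
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
```

A novel passage can be folded into the same space as a weighted combination of its word vectors, which is what allows the trained space to score text it never saw.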


Reinforcement Learning for Continuous Stochastic Control Problems

Neural Information Processing Systems

Here we study the continuous-time, continuous state-space stochastic case, which covers a wide variety of control problems including target, viability, and optimization problems (see [FS93], [KP95]), for which a formalism is the following.


Statistical Models of Conditioning

Neural Information Processing Systems

Conditioning experiments probe the ways that animals make predictions about rewards and punishments and use those predictions to control their behavior. A standard model of conditioning paradigms that involve many conditioned stimuli suggests that the individual predictions should be added together. Various key results show that this model fails in some circumstances, and motivate an alternative model, in which there is attentional selection between the different available stimuli. The new model is a form of mixture of experts, has a close relationship with some other existing psychological suggestions, and is statistically well-founded.
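The additive model referred to above is the classic delta-rule account (Rescorla-Wagner): the net prediction on a trial is the sum of the weights of all stimuli present, and every present stimulus is updated by the same shared prediction error. A minimal sketch, with an assumed learning rate:

```python
import numpy as np

def rescorla_wagner(stimuli, rewards, eta=0.1):
    # stimuli: trials x stimuli binary matrix (which stimuli are present);
    # rewards: reward delivered on each trial.
    # The net prediction is the *sum* of per-stimulus weights, and all
    # stimuli present on a trial share the same prediction error.
    w = np.zeros(stimuli.shape[1])
    for x, r in zip(stimuli, rewards):
        delta = r - w @ x          # shared prediction error for the trial
        w += eta * delta * x       # delta-rule update for present stimuli
    return w
```

Because the error is shared, a pre-trained stimulus blocks learning about a newly added one; the circumstances where this additive account fails are what motivate the attentional mixture-of-experts alternative.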


Prior Knowledge in Support Vector Kernels

Neural Information Processing Systems

We explore methods for incorporating prior knowledge about a problem at hand in Support Vector learning machines. We show that both invariances under group transformations and prior knowledge about locality in images can be incorporated by constructing appropriate kernel functions.


On the Separation of Signals from Neighboring Cells in Tetrode Recordings

Neural Information Processing Systems

We discuss a solution to the problem of separating waveforms produced by multiple cells in an extracellular neural recording. We take an explicitly probabilistic approach, using latent-variable models of varying sophistication to describe the distribution of waveforms produced by a single cell. The models range from a single Gaussian distribution of waveforms for each cell to a mixture of hidden Markov models. We stress the overall statistical structure of the approach, allowing the details of the generative model chosen to depend on the specific neural preparation.
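The simplest end of the model range above (one Gaussian per cell) already gives a classification rule: assign each recorded waveform to the cell under whose Gaussian it is most likely. A sketch, assuming a shared covariance across cells for brevity:

```python
import numpy as np

def classify_waveform(w, means, cov):
    # One Gaussian waveform distribution per cell, shared covariance
    # assumed for brevity; assign the waveform to the cell with the
    # highest log-likelihood (constant terms cancel and are dropped).
    icov = np.linalg.inv(cov)
    ll = [-0.5 * (w - m) @ icov @ (w - m) for m in means]
    return int(np.argmax(ll))
```

The more sophisticated models in the range (mixtures, hidden Markov models for overlapping spikes) keep this same structure but replace the per-cell likelihood with a richer one.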