Goto

Collaborating Authors

 Information Technology


Visual Cortex Circuitry and Orientation Tuning

Neural Information Processing Systems

A simple mathematical model for the large-scale circuitry of primary visual cortex is introduced. It is shown that a basic cortical architecture of recurrent local excitation and lateral inhibition can account quantitatively for such properties as orientation tuning. The model can also account for such local effects as cross-orientation suppression. It is also shown that nonlocal state-dependent coupling between similar orientation patches, when added to the model, can satisfactorily reproduce such effects as non-local iso--orientation suppression, and non-local crossorientation enhancement. Following this an account is given of perceptual phenomena involving object segmentation, such as "popout", and the direct and indirect tilt illusions.


Ensemble Methods for Phoneme Classification

Neural Information Processing Systems

There is now considerable interest in using ensembles or committees of learning machines to improve the performance of the system over that of a single learning machine. In most neural network ensembles, the ensemble members are trained on either the same data (Hansen & Salamon 1990) or different subsets of the data (Perrone & Cooper 1993). The ensemble members typically have different initial conditions and/or different architectures. The subsets of the data may be chosen at random, with prior knowledge or by some principled approach e.g.


Support Vector Method for Function Approximation, Regression Estimation and Signal Processing

Neural Information Processing Systems

The Support Vector (SV) method was recently proposed for estimating regressions, constructing multidimensional splines, and solving linear operator equations [Vapnik, 1995]. In this presentation we report results of applying the SV method to these problems.


Using Curvature Information for Fast Stochastic Search

Neural Information Processing Systems

We present an algorithm for fast stochastic gradient descent that uses a nonlinear adaptive momentum scheme to optimize the late time convergence rate. The algorithm makes effective use of curvature information, requires only O(n) storage and computation, and delivers convergence rates close to the theoretical optimum. We demonstrate the technique on linear and large nonlinear backprop networks.


Blind Separation of Delayed and Convolved Sources

Neural Information Processing Systems

We address the difficult problem of separating multiple speakers with multiple microphones in a real room. We combine the work of Torkkola and Amari, Cichocki and Yang, to give Natural Gradient information maximisation rules for recurrent (IIR) networks, blindly adjusting delays, separating and deconvolving mixed signals. While they work well on simulated data, these rules fail in real rooms which usually involve non-minimum phase transfer functions, not-invertible using stable IIR filters. An approach that sidesteps this problem is to perform infomax on a feedforward architecture in the frequency domain (Lambert 1996). We demonstrate real-room separation of two natural signals using this approach.


Clustering Sequences with Hidden Markov Models

Neural Information Processing Systems

This paper discusses a probabilistic model-based approach to clustering sequences, using hidden Markov models (HMMs). The problem can be framed as a generalization of the standard mixture model approach to clustering in feature space. Two primary issues are addressed. First, a novel parameter initialization procedure is proposed, and second, the more difficult problem of determining the number of clusters K, from the data, is investigated. Experimental results indicate that the proposed techniques are useful for revealing hidden cluster structure in data sets of sequences.


The Learning Dynamcis of a Universal Approximator

Neural Information Processing Systems

The learning properties of a universal approximator, a normalized committee machine with adjustable biases, are studied for online back-propagation learning. Within a statistical mechanics framework, numerical studies show that this model has features which do not exist in previously studied two-layer network models without adjustable biases, e.g., attractive suboptimal symmetric phases even for realizable cases and noiseless data. 1 INTRODUCTION Recently there has been much interest in the theoretical breakthrough in the understanding of the online learning dynamics of multi-layer feedforward perceptrons (MLPs) using a statistical mechanics framework. In the seminal paper (Saad & Solla, 1995), a two-layer network with an arbitrary number of hidden units was studied, allowing insight into the learning behaviour of neural network models whose complexity is of the same order as those used in real world applications. The model studied, a soft committee machine (Biehl & Schwarze, 1995), consists of a single hidden layer with adjustable input-hidden, but fixed hidden-output weights. The average learning dynamics of these networks are studied in the thermodynamic limit of infinite input dimensions in a student-teacher scenario, where a stu.dent network is presented serially with training examples (e lS, (IS) labelled by a teacher network of the same architecture but possibly different number of hidden units.


Exploiting Model Uncertainty Estimates for Safe Dynamic Control Learning

Neural Information Processing Systems

Figure 2: The task is to move the cart to the origin as quickly as possible without dropping the pole. The bottom three pictures show a trace of the policy execution obtained after one, two, and three trials (shown in increments of 0.5 seconds) Controller Number of data points used Cost from initial state 17 to build the controller LQR


Representing Face Images for Emotion Classification

Neural Information Processing Systems

We compare the generalization performance of three distinct representation schemes for facial emotions using a single classification strategy (neural network). The face images presented to the classifiers are represented as: full face projections of the dataset onto their eigenvectors (eigenfaces); a similar projection constrained to eye and mouth areas (eigenfeatures); and finally a projection of the eye and mouth areas onto the eigenvectors obtained from 32x32 random image patches from the dataset. The latter system achieves 86% generalization on novel face images (individuals the networks were not trained on) drawn from a database in which human subjects consistently identify a single emotion for the face. 1 Introduction Some of the most successful research in machine perception of complex natural image objects (like faces), has relied heavily on reduction strategies that encode an object as a set of values that span the principal component subspace of the object's images [Cottrell and Metcalfe, 1991, Pentland et al., 1994]. This approach has gained wide acceptance for its success in classification, for the efficiency in which the eigenvectors can be calculated, and because the technique permits an implementation that is biologically plausible. The procedure followed in generating these face representations requires normalizing a large set of face views (" mugshots") and from these, identifying a statistically relevant subspace.


Spectroscopic Detection of Cervical Pre-Cancer through Radial Basis Function Networks

Neural Information Processing Systems

The mortality related to cervical cancer can be substantially reduced through early detection and treatment. However, current detection techniques, such as Pap smear and colposcopy, fail to achieve a concurrently high sensitivity and specificity. In vivo fluorescence spectroscopy is a technique which quickly, noninvasively and quantitatively probes the biochemical and morphological changes that occur in precancerous tissue. RBF ensemble algorithms based on such spectra provide automated, and near realtime implementation of pre-cancer detection in the hands of nonexperts. The results are more reliable, direct and accurate than those achieved by either human experts or multivariate statistical algorithms. 1 Introduction Cervical carcinoma is the second most common cancer in women worldwide, exceeded only by breast cancer (Ramanujam et al., 1996). The mortality related to cervical cancer can be reduced if this disease is detected at the precancerous state, known as squamous intraepitheliallesion (SIL). Currently, a Pap smear is used to 982 K. Turner, N. Ramanujam, R. Richards-Kortum and J. Ghosh screen for cervical cancer {Kurman et al., 1994}. In a Pap test, a large number of cells obtained by scraping the cervical epithelium are smeared onto a slide which is then fixed and stained for cytologic examination.