Spectroscopic Detection of Cervical Pre-Cancer through Radial Basis Function Networks

Neural Information Processing Systems

The mortality related to cervical cancer can be substantially reduced through early detection and treatment. However, current detection techniques, such as Pap smear and colposcopy, fail to achieve a concurrently high sensitivity and specificity. In vivo fluorescence spectroscopy is a technique which quickly, noninvasively and quantitatively probes the biochemical and morphological changes that occur in precancerous tissue. RBF ensemble algorithms based on such spectra provide an automated, near real-time implementation of pre-cancer detection in the hands of nonexperts. The results are more reliable, direct and accurate than those achieved by either human experts or multivariate statistical algorithms.

1 Introduction

Cervical carcinoma is the second most common cancer in women worldwide, exceeded only by breast cancer (Ramanujam et al., 1996). The mortality related to cervical cancer can be reduced if this disease is detected at the precancerous state, known as squamous intraepithelial lesion (SIL).
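The RBF approach can be illustrated with a minimal sketch: a single Gaussian-RBF network with a ridge-regularized least-squares output layer, applied to synthetic two-class data standing in for fluorescence spectra. The paper uses ensembles of such networks; all function names, data, and parameters below are illustrative assumptions, not the authors' code.

```python
import numpy as np

def rbf_features(X, centers, width):
    """Gaussian radial basis activations for each (sample, center) pair."""
    # Squared Euclidean distances between samples and centers.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * width ** 2))

def fit_rbf_classifier(X, y, centers, width, ridge=1e-3):
    """Least-squares output weights on RBF features (binary labels in {0, 1})."""
    Phi = rbf_features(X, centers, width)
    # Ridge-regularized normal equations for the linear output layer.
    A = Phi.T @ Phi + ridge * np.eye(Phi.shape[1])
    return np.linalg.solve(A, Phi.T @ y)

def predict(X, centers, width, w, threshold=0.5):
    """Threshold the network output to get a class label."""
    return (rbf_features(X, centers, width) @ w >= threshold).astype(int)
```

An ensemble, as in the paper, would train several such networks (e.g. on resampled data or different centers) and vote or average their outputs.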


Maximum Likelihood Blind Source Separation: A Context-Sensitive Generalization of ICA

Neural Information Processing Systems

We cast the problem as one of maximum likelihood density estimation, and in that framework introduce an algorithm that searches for independent components using both temporal and spatial cues. We call the resulting algorithm "Contextual ICA," after the (Bell and Sejnowski 1995) Infomax algorithm, which we show to be a special case of cICA. Because cICA can make use of the temporal structure of its input, it is able to separate in a number of situations where standard methods cannot, including sources with low kurtosis, colored Gaussian sources, and sources which have Gaussian histograms.

1 The Blind Source Separation Problem

Consider a set of n independent sources
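The context-free special case mentioned above, the (Bell and Sejnowski 1995) Infomax algorithm, can be sketched as a natural-gradient maximum-likelihood update under a logistic source density. This is a toy batch version for intuition; the function name and hyperparameters are assumptions, and cICA's temporal (context-sensitive) density models are not included.

```python
import numpy as np

def infomax_ica(X, lr=0.02, epochs=300, seed=0):
    """Batch Infomax ICA with a logistic source density.

    X: (n_sources, n_samples) zero-mean mixed signals.
    Returns an unmixing matrix W such that W @ X estimates the sources.
    """
    rng = np.random.default_rng(seed)
    n, T = X.shape
    W = np.eye(n) + 0.01 * rng.standard_normal((n, n))
    for _ in range(epochs):
        U = W @ X                            # current source estimates
        Y = 1.0 / (1.0 + np.exp(-U))         # logistic nonlinearity
        # Natural-gradient ascent on the log-likelihood, averaged over samples.
        W = W + lr * (np.eye(n) + (1.0 - 2.0 * Y) @ U.T / T) @ W
    return W
```

Contextual ICA replaces the fixed logistic density with a per-source density model conditioned on past samples, which is what lets it handle the low-kurtosis and colored-Gaussian cases listed above.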


Combining Neural Network Regression Estimates with Regularized Linear Weights

Neural Information Processing Systems

When combining a set of learned models to form an improved estimator, the issue of redundancy or multicollinearity in the set of models must be addressed. A progression of existing approaches and their limitations with respect to redundancy is discussed. A new approach, PCR*, based on principal components regression is proposed to address these limitations. An evaluation of the new approach on a collection of domains reveals that: 1) PCR* was the most robust combination method as the redundancy of the learned models increased; 2) redundancy could be handled without eliminating any of the learned models; and 3) the principal components of the learned models provided a continuum of "regularized" weights from which PCR* could choose.
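The core idea can be sketched as follows: run principal components analysis on the matrix of model predictions, regress the target on the top components, and map the coefficients back to per-model weights. This is a minimal illustration, not the authors' PCR* code; in particular, PCR*'s data-driven choice of how many components to keep is replaced here by a fixed k.

```python
import numpy as np

def pcr_combine(P, y, k):
    """Combine learned-model predictions by principal components regression.

    P: (n_samples, n_models) matrix of each model's predictions.
    y: (n_samples,) targets.  k: number of principal components kept.
    Returns per-model weights w and an intercept b.
    """
    mu = P.mean(axis=0)
    Pc = P - mu
    # Principal components of the (possibly collinear) prediction matrix.
    U, s, Vt = np.linalg.svd(Pc, full_matrices=False)
    Z = Pc @ Vt[:k].T                       # scores on the top-k components
    beta = np.linalg.lstsq(Z, y - y.mean(), rcond=None)[0]
    w = Vt[:k].T @ beta                     # back to per-model weights
    b = y.mean() - mu @ w
    return w, b
```

Because redundant models load onto the same components, collinearity is absorbed rather than requiring any model to be discarded, which is point 2) above.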


Bangs, Clicks, Snaps, Thuds and Whacks: An Architecture for Acoustic Transient Processing

Neural Information Processing Systems

We show how judicious normalization of a time-frequency signal allows an elegant and robust implementation of a correlation algorithm. The algorithm uses binary multiplexing instead of analog-analog multiplication. This removes the need for analog storage and analog multiplication. Simulations show that the resulting algorithm has the same out-of-sample classification performance (~93% correct) as a baseline template-matching algorithm.
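The binary-multiplexing idea can be sketched numerically: the stored template is reduced to its sign pattern, so correlating a normalized time-frequency patch against it requires only sign-controlled additions and subtractions, never an analog-analog product. This is a toy illustration; the function names and the particular normalization are assumptions, not the paper's circuit.

```python
import numpy as np

def binary_correlate(signal_tf, template_tf):
    """Correlate a time-frequency patch against a template using only the
    template's sign pattern (binary multiplexing)."""
    # Normalize the input patch so amplitudes are comparable across events.
    x = signal_tf / (np.abs(signal_tf).sum() + 1e-12)
    # The stored template is reduced to +/-1; multiplication becomes
    # sign-controlled addition/subtraction.
    b = np.sign(template_tf)
    return (x * b).sum()

def classify(signal_tf, templates):
    """Pick the template class with the largest binary correlation."""
    scores = [binary_correlate(signal_tf, t) for t in templates]
    return int(np.argmax(scores))
```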


An Apobayesian Relative of Winnow

Neural Information Processing Systems

We study a mistake-driven variant of an online Bayesian learning algorithm (similar to one studied by Cesa-Bianchi, Helmbold, and Panizza [CHP96]). This variant only updates its state (learns) on trials in which it makes a mistake. The algorithm makes binary classifications using a linear-threshold classifier and runs in time linear in the number of attributes seen by the learner. We have been able to show, theoretically and in simulations, that this algorithm performs well under assumptions quite different from those embodied in the prior of the original Bayesian algorithm. It can handle situations that we do not know how to handle in linear time with Bayesian algorithms. We expect our techniques to be useful in deriving and analyzing other apobayesian algorithms.

1 Introduction

We consider two styles of online learning.
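The classical mistake-driven linear-threshold learner in this spirit is Winnow, which likewise changes its state only on mistakes and spends time linear in the number of attributes per trial. The sketch below is the standard Winnow rule for intuition, not the apobayesian algorithm itself.

```python
import numpy as np

def winnow_train(X, y, alpha=2.0):
    """Mistake-driven Winnow: multiplicative weight updates, applied only on
    trials where the linear-threshold prediction is wrong.

    X: (n_trials, n_attrs) binary attributes; y: labels in {0, 1}.
    """
    n = X.shape[1]
    w = np.ones(n)
    theta = n / 2.0              # fixed threshold
    mistakes = 0
    for x, target in zip(X, y):
        pred = int(w @ x >= theta)
        if pred != target:       # state changes only on a mistake
            mistakes += 1
            if target == 1:
                w[x == 1] *= alpha   # promote active attributes
            else:
                w[x == 1] /= alpha   # demote active attributes
    return w, theta, mistakes
```

On monotone disjunctions of k out of n attributes, Winnow's mistake bound grows only logarithmically in n, which is the kind of attribute-efficiency the paper's analysis concerns.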


Hidden Markov Decision Trees

Neural Information Processing Systems

We study a time series model that can be viewed as a decision tree with Markov temporal structure. The model is intractable for exact calculations, thus we utilize variational approximations. We consider three different distributions for the approximation: one in which the Markov calculations are performed exactly and the layers of the decision tree are decoupled, one in which the decision tree calculations are performed exactly and the time steps of the Markov chain are decoupled, and one in which a Viterbi-like assumption is made to pick out a single most likely state sequence.
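The third approximation's Viterbi-like assumption picks out a single most likely state sequence; for a plain Markov chain that computation is the standard Viterbi recursion, sketched below. This is illustrative only: the decision-tree coupling of the actual model is omitted, and the interface is an assumption.

```python
import numpy as np

def viterbi(log_pi, log_A, log_B):
    """Most likely hidden state sequence of a Markov chain.

    log_pi: (S,) initial log-probabilities; log_A: (S, S) transition
    log-probabilities; log_B: (T, S) per-step emission log-likelihoods.
    """
    T, S = log_B.shape
    delta = log_pi + log_B[0]          # best log-prob ending in each state
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + log_A    # best predecessor for each state
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + log_B[t]
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):      # trace the best path backwards
        path.append(int(back[t][path[-1]]))
    return path[::-1]
```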


Visual Cortex Circuitry and Orientation Tuning

Neural Information Processing Systems

A simple mathematical model for the large-scale circuitry of primary visual cortex is introduced. It is shown that a basic cortical architecture of recurrent local excitation and lateral inhibition can account quantitatively for such properties as orientation tuning. The model can also account for such local effects as cross-orientation suppression. It is also shown that nonlocal state-dependent coupling between similar orientation patches, when added to the model, can satisfactorily reproduce such effects as non-local iso-orientation suppression and non-local cross-orientation enhancement. Following this, an account is given of perceptual phenomena involving object segmentation, such as "popout", and the direct and indirect tilt illusions.
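The core mechanism, recurrent local excitation with lateral inhibition sharpening a weakly tuned feedforward input, can be illustrated with a generic ring-model simulation. The parameter values and function name below are assumptions for illustration, not the paper's quantitative model.

```python
import numpy as np

def ring_model(theta_stim=0.0, n=64, J0=-1.0, J2=1.5, steps=2000, dt=0.01):
    """Ring model of orientation tuning: recurrent local excitation and
    lateral inhibition sharpen a weakly tuned feedforward input."""
    theta = np.linspace(-np.pi / 2, np.pi / 2, n, endpoint=False)
    # Cosine connectivity: net excitation between similar orientations,
    # net inhibition between dissimilar ones.
    W = (J0 + J2 * np.cos(2.0 * (theta[:, None] - theta[None, :]))) / n
    h = 1.0 + 0.2 * np.cos(2.0 * (theta - theta_stim))  # weakly tuned input
    r = np.zeros(n)
    for _ in range(steps):
        r += dt * (-r + np.maximum(0.0, h + W @ r))     # rectified rate dynamics
    return theta, r
```

The steady-state response peaks at the stimulus orientation and is substantially narrower than the 20%-modulated input, which is the sharpening effect the model attributes to the recurrent circuitry.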


GTM: A Principled Alternative to the Self-Organizing Map

Neural Information Processing Systems

The Self-Organizing Map (SOM) algorithm has been extensively studied and has been applied with considerable success to a wide variety of problems. However, the algorithm is derived from heuristic ideas and this leads to a number of significant limitations. In this paper, we consider the problem of modelling the probability density of data in a space of several dimensions in terms of a smaller number of latent, or hidden, variables. We introduce a novel form of latent variable model, which we call the GTM algorithm (for Generative Topographic Mapping), which allows general nonlinear transformations from latent space to data space, and which is trained using the EM (expectation-maximization) algorithm. Our approach overcomes the limitations of the SOM, while introducing no significant disadvantages. We demonstrate the performance of the GTM algorithm on simulated data from flow diagnostics for a multiphase oil pipeline.
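A minimal 1-D sketch of the idea: a grid of latent points is mapped through a fixed RBF basis into data space, and the resulting constrained Gaussian mixture (tied centres, shared variance) is trained by EM. The initialization, basis width, and function name here are illustrative simplifications, not the paper's exact algorithm.

```python
import numpy as np

def gtm_fit(X, n_latent=20, n_rbf=5, iters=30):
    """1-D GTM sketch: latent grid -> RBF basis -> data space, trained by EM."""
    N, D = X.shape
    z = np.linspace(-1.0, 1.0, n_latent)               # latent grid
    mu = np.linspace(-1.0, 1.0, n_rbf)                 # basis centres (latent space)
    Phi = np.exp(-(z[:, None] - mu[None, :]) ** 2 / 0.5)
    Phi = np.hstack([Phi, np.ones((n_latent, 1))])     # bias column
    # Initialise the map along the first principal component of the data.
    Xc = X - X.mean(axis=0)
    u = np.linalg.svd(Xc, full_matrices=False)[2][0]
    target = X.mean(axis=0) + np.outer(z * Xc.std(), u)
    W = np.linalg.lstsq(Phi, target, rcond=None)[0]
    beta = 1.0                                         # inverse noise variance
    for _ in range(iters):
        Y = Phi @ W                                    # mixture centres in data space
        d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
        # E-step: responsibility of each latent point for each data point.
        L = -0.5 * beta * d2
        L -= L.max(axis=1, keepdims=True)
        R = np.exp(L)
        R /= R.sum(axis=1, keepdims=True)
        # M-step: weighted least squares for W, then update the noise level.
        G = np.diag(R.sum(axis=0))
        W = np.linalg.solve(Phi.T @ G @ Phi + 1e-6 * np.eye(Phi.shape[1]),
                            Phi.T @ R.T @ X)
        beta = (N * D) / (R * d2).sum()
    return Phi @ W
```

Unlike the SOM, every step here maximizes an explicit likelihood, which is the sense in which GTM is "principled": convergence follows from standard EM theory rather than from heuristics.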


Size of Multilayer Networks for Exact Learning: Analytic Approach

Neural Information Processing Systems

The architecture of the network is feedforward, with one hidden layer and several outputs. Starting from a fixed training set, we consider the network as a function of its weights. We derive, for a wide family of transfer functions, a lower and an upper bound on the number of hidden units for exact learning, given the size of the dataset and the dimensions of the input and output spaces.

1 RELATED WORKS

The context of our work is rather similar to the well-known results of Baum et al. [1, 2, 3, 5, 10], but we consider both real inputs and outputs, instead of the dichotomies usually addressed. We are interested in learning exactly all the examples of a fixed database, hence our work is different from stating that multilayer networks are universal approximators [6, 8, 9]. Since we consider real outputs and not only dichotomies, it is not straightforward to compare our results to the recent works about the VC-dimension of multilayer networks [11, 12, 13]. Our study is more closely related to several works of Sontag [14, 15], but with different hypotheses on the transfer functions of the units. Finally, our approach is based on geometrical considerations and is close to the model of Coetzee and Stonick [4]. First we define the model of the network and the notation, and second we develop our analytic approach and prove the fundamental theorem. In the last section, we discuss our point of view and propose some practical consequences of the result.