Interpolating Earth-science Data using RBF Networks and Mixtures of Experts
We present a mixture of experts (ME) approach to interpolate sparse, spatially correlated earth-science data. Kriging is an interpolation method which uses a global covariation model estimated from the data to take account of the spatial dependence in the data. Based on the close relationship between kriging and the radial basis function (RBF) network (Wan & Bone, 1996), we use a mixture of generalized RBF networks to partition the input space into statistically correlated regions and learn the local covariation model of the data in each region. Applying the ME approach to simulated and real-world data, we show that it is able to achieve good partitioning of the input space, learn the local covariation models and improve generalization.
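The kriging/RBF relationship mentioned above can be illustrated with a minimal sketch of exact interpolation by a single Gaussian RBF network. This is not the mixture of generalized RBF networks described in the abstract, only the basic building block: one basis function per data point, with weights obtained by solving a linear system. The Gaussian kernel, the `length_scale` and `ridge` parameters, and all function names are illustrative assumptions.

```python
import math


def gaussian_kernel(p, q, length_scale):
    """Gaussian basis function of the distance between points p and q."""
    d2 = sum((a - b) ** 2 for a, b in zip(p, q))
    return math.exp(-d2 / (2.0 * length_scale ** 2))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_rbf(points, values, length_scale=1.0, ridge=1e-9):
    """Return one weight per data point so the network reproduces the data.

    The small ridge term (an assumption) only keeps the system numerically stable.
    """
    n = len(points)
    K = [
        [gaussian_kernel(points[i], points[j], length_scale)
         + (ridge if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    return solve(K, values)


def predict(x, points, weights, length_scale=1.0):
    """Interpolated value at x: a weighted sum of basis functions."""
    return sum(w * gaussian_kernel(x, p, length_scale)
               for w, p in zip(weights, points))


if __name__ == "__main__":
    # Hypothetical sparse samples of a spatial field.
    pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
    vals = [1.0, 2.0, 0.5, 3.0, 1.5]
    w = fit_rbf(pts, vals, length_scale=0.5)
    print(predict((0.25, 0.75), pts, w, length_scale=0.5))
```

In this sketch a single kernel width is shared across the whole input space, which plays the role of a global covariation model; the mixture approach in the abstract instead learns a separate local model per region.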
Learning Exact Patterns of Quasi-synchronization among Spiking Neurons from Data on Multi-unit Recordings
Martignon, Laura, Laskey, Kathryn B., Deco, Gustavo, Vaadia, Eilon
This paper develops arguments for a family of temporal log-linear models to represent spatiotemporal correlations among the spiking events in a group of neurons. The models can represent not just pairwise correlations but also correlations of higher order. Methods are discussed for inferring the existence or absence of correlations and estimating their strength. A frequentist and a Bayesian approach to correlation detection are compared.
Efficient Nonlinear Control with Actor-Tutor Architecture
A new reinforcement learning architecture for nonlinear control is proposed. A direct feedback controller, or the actor, is trained by a value-gradient based controller, or the tutor. This architecture enables both efficient use of the value function and simple computation for real-time implementation. Good performance was verified in multidimensional nonlinear control tasks using Gaussian softmax networks.
Spectroscopic Detection of Cervical Pre-Cancer through Radial Basis Function Networks
Tumer, Kagan, Ramanujam, Nirmala, Richards-Kortum, Rebecca R., Ghosh, Joydeep
The mortality related to cervical cancer can be substantially reduced through early detection and treatment. However, current detection techniques, such as Pap smear and colposcopy, fail to achieve a concurrently high sensitivity and specificity. In vivo fluorescence spectroscopy is a technique which quickly, noninvasively and quantitatively probes the biochemical and morphological changes that occur in precancerous tissue. RBF ensemble algorithms based on such spectra provide automated, near real-time pre-cancer detection in the hands of nonexperts. The results are more reliable, direct and accurate than those achieved by either human experts or multivariate statistical algorithms.
1 Introduction
Cervical carcinoma is the second most common cancer in women worldwide, exceeded only by breast cancer (Ramanujam et al., 1996). The mortality related to cervical cancer can be reduced if this disease is detected at the precancerous state, known as squamous intraepithelial lesion (SIL). Currently, a Pap smear is used to screen for cervical cancer (Kurman et al., 1994). In a Pap test, a large number of cells obtained by scraping the cervical epithelium are smeared onto a slide which is then fixed and stained for cytologic examination.
Regression with Input-Dependent Noise: A Bayesian Treatment
Bishop, Christopher M., Quazaz, Cazhaow S.
In most treatments of the regression problem it is assumed that the distribution of target data can be described by a deterministic function of the inputs, together with additive Gaussian noise having constant variance. The use of maximum likelihood to train such models then corresponds to the minimization of a sum-of-squares error function. In many applications a more realistic model would allow the noise variance itself to depend on the input variables. However, the use of maximum likelihood to train such models would give highly biased results. In this paper we show how a Bayesian treatment can allow for an input-dependent variance while overcoming the bias of maximum likelihood.
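The difference between the constant-variance and input-dependent noise models can be seen in the Gaussian negative log-likelihood. The following is a minimal sketch, assuming the model supplies a predicted mean and a log-variance for each input; the log-variance parameterization and the function name are illustrative assumptions. It shows only the likelihood term, not the Bayesian treatment proposed in the paper.

```python
import math


def gaussian_nll(targets, means, log_variances):
    """Negative log-likelihood of targets under independent Gaussians.

    Each data point has its own mean and its own variance exp(log_variance),
    so the noise level is allowed to depend on the input.
    """
    total = 0.0
    for t, m, lv in zip(targets, means, log_variances):
        total += 0.5 * (math.log(2.0 * math.pi) + lv + (t - m) ** 2 / math.exp(lv))
    return total
```

When every log-variance is fixed to the same constant, the only term that depends on the means is the scaled sum of squared errors, which is why maximum likelihood with constant variance reduces to sum-of-squares minimization.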
A Convergence Proof for the Softassign Quadratic Assignment Algorithm
Rangarajan, Anand, Yuille, Alan L., Gold, Steven, Mjolsness, Eric
The softassign quadratic assignment algorithm has recently emerged as an effective strategy for a variety of optimization problems in pattern recognition and combinatorial optimization. While the effectiveness of the algorithm was demonstrated in thousands of simulations, there was no known proof of convergence. Here, we provide a proof of convergence for the most general form of the algorithm.
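As a rough illustration of the core softassign step, the sketch below exponentiates a benefit matrix at a fixed inverse temperature and then alternates row and column normalization (Sinkhorn balancing) to approach a doubly stochastic match matrix. It covers only this inner step for a square problem: the full algorithm also recomputes the benefit matrix from the quadratic objective and anneals the temperature. The names `benefit`, `beta` and `n_iters` are illustrative assumptions.

```python
import math


def softassign_step(benefit, beta, n_iters=200):
    """Soft assignment matrix from a square benefit matrix at inverse temperature beta.

    Entries are exponentiated (so they are positive), then rows and columns
    are normalized alternately until the matrix is close to doubly stochastic.
    """
    n = len(benefit)
    m = [[math.exp(beta * q) for q in row] for row in benefit]
    for _ in range(n_iters):
        # Row normalization: each row sums to one.
        for row in m:
            s = sum(row)
            for j in range(n):
                row[j] /= s
        # Column normalization: each column sums to one.
        for j in range(n):
            s = sum(m[i][j] for i in range(n))
            for i in range(n):
                m[i][j] /= s
    return m
```

Larger values of `beta` push the result toward a hard permutation matrix, while small values give a nearly uniform assignment.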
Spatial Decorrelation in Orientation Tuned Cortical Cells
Dimitrov, Alexander, Cowan, Jack D.
In this paper we propose a model for the lateral connectivity of orientation-selective cells in the visual cortex based on information-theoretic considerations. We study the properties of the input signal to the visual cortex and find new statistical structures which have not been processed in the retino-geniculate pathway. Applying the idea that the system optimizes the representation of incoming signals, we derive the lateral connectivity that will achieve this for a set of local orientation-selective patches, as well as the complete spatial structure of a layer of such patches. We compare the results with various physiological measurements.
Separating Style and Content
Tenenbaum, Joshua B., Freeman, William T.
We seek to analyze and manipulate two factors, which we call style and content, underlying a set of observations. We fit training data with bilinear models which explicitly represent the two-factor structure. These models can adapt easily during testing to new styles or content, allowing us to solve three general tasks: extrapolation of a new style to unobserved content; classification of content observed in a new style; and translation of new content observed in a new style.
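A bilinear model of the kind described above can be sketched as follows: each output component is a weighted sum over all products of one style coefficient and one content coefficient. This is a minimal illustration of the two-factor structure, not the fitting procedure from the paper; the function name and the `weights[k][i][j]` layout are illustrative assumptions.

```python
def bilinear_observation(style, content, weights):
    """Observation generated by a style vector and a content vector.

    weights[k][i][j] couples style component i and content component j
    into output component k.
    """
    return [
        sum(style[i] * w_k[i][j] * content[j]
            for i in range(len(style))
            for j in range(len(content)))
        for w_k in weights
    ]
```

Because the output is linear in the style vector once the content vector is held fixed (and vice versa), adapting to a new style with known content reduces to a linear fitting problem.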
Combinations of Weak Classifiers
To obtain classification systems with both good generalization performance and efficiency in space and time, we propose a learning method based on combinations of weak classifiers, where each weak classifier is a linear classifier (perceptron) that does only a little better than random guessing. A randomized algorithm is proposed to find the weak classifiers. They are then combined through a majority vote. As demonstrated through systematic experiments, the method developed is able to obtain combinations of weak classifiers with good generalization performance and a fast training time on a variety of test problems and real applications.
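The idea of combining weak linear classifiers by majority vote can be sketched as below. The abstract does not specify the randomized algorithm, so this sketch simply assumes one plausible scheme: draw random hyperplanes and keep those whose training accuracy exceeds a threshold slightly above chance. The function names, the Gaussian sampling of weights, and the `min_accuracy` and `max_tries` parameters are all illustrative assumptions.

```python
import random


def predict_linear(w, b, x):
    """Perceptron output in {-1, +1} for input x."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


def accuracy(clf, X, y):
    """Fraction of examples the classifier (w, b) labels correctly."""
    w, b = clf
    return sum(predict_linear(w, b, x) == t for x, t in zip(X, y)) / len(y)


def find_weak_classifiers(X, y, n_classifiers, min_accuracy=0.55,
                          rng=None, max_tries=10000):
    """Sample random hyperplanes; keep those slightly better than chance."""
    rng = rng or random.Random(0)
    dim = len(X[0])
    kept = []
    for _ in range(max_tries):
        if len(kept) == n_classifiers:
            break
        w = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        b = rng.gauss(0.0, 1.0)
        if accuracy((w, b), X, y) >= min_accuracy:
            kept.append((w, b))
    return kept


def majority_vote(classifiers, x):
    """Combine the weak classifiers' votes; ties go to the +1 class."""
    votes = sum(predict_linear(w, b, x) for w, b in classifiers)
    return 1 if votes >= 0 else -1


if __name__ == "__main__":
    # Hypothetical two-dimensional toy problem.
    data_rng = random.Random(1)
    X = [(data_rng.gauss(0.0, 1.0), data_rng.gauss(0.0, 1.0)) for _ in range(101)]
    y = [1 if a + b >= 0 else -1 for a, b in X]
    clfs = find_weak_classifiers(X, y, 15, rng=random.Random(2))
    correct = sum(majority_vote(clfs, x) == t for x, t in zip(X, y))
    print(len(clfs), correct / len(y))
```

Using an odd number of classifiers avoids tied votes, and each classifier costs only one dot product at prediction time.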
Adaptively Growing Hierarchical Mixtures of Experts
Fritsch, Jürgen, Finke, Michael, Waibel, Alex
We propose a novel approach to automatically growing and pruning Hierarchical Mixtures of Experts. The constructive algorithm proposed here enables large hierarchies consisting of several hundred experts to be trained effectively. We show that HMEs trained by our automatic growing procedure yield better generalization performance than traditional static and balanced hierarchies. Evaluation of the algorithm is performed (1) on vowel classification and (2) within a hybrid version of the JANUS [9] speech recognition system using a subset of the Switchboard large-vocabulary speaker-independent continuous speech recognition database.