Function Approximation with the Sweeping Hinge Algorithm

Neural Information Processing Systems

We present a computationally efficient algorithm for function approximation with piecewise-linear sigmoidal nodes. A one-hidden-layer network is constructed one node at a time using the method of fitting the residual. The task of fitting individual nodes is accomplished using a new algorithm that searches for the best fit by solving a sequence of Quadratic Programming problems. This approach offers significant advantages over derivative-based search algorithms.
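
As a rough illustration of the residual-fitting construction, here is a minimal numpy sketch for 1-D inputs. The paper's QP-based search for the best hinge is replaced by a naive grid scan, and all function names are ours, so this is a sketch of the overall greedy scheme rather than the algorithm itself:

    import numpy as np

    def ramp(x, a, b):
        # piecewise-linear sigmoidal ("ramp") node: an affine map clipped to [0, 1]
        return np.clip(a * x + b, 0.0, 1.0)

    def fit_node(x, r, grid=40):
        # crude stand-in for the QP-based search: scan candidate slopes and
        # offsets, solving least squares for the output weight and bias
        best = (np.inf, None)
        for a in np.linspace(-10.0, 10.0, grid):
            for b in np.linspace(-5.0, 5.0, grid):
                A = np.stack([ramp(x, a, b), np.ones_like(x)], axis=1)
                w, *_ = np.linalg.lstsq(A, r, rcond=None)
                err = np.sum((A @ w - r) ** 2)
                if err < best[0]:
                    best = (err, (a, b, w))
        return best[1]

    def build_network(x, y, n_nodes=5):
        # grow a one-hidden-layer network one node at a time,
        # fitting each new node to the current residual
        residual, nodes = y.astype(float).copy(), []
        for _ in range(n_nodes):
            a, b, w = fit_node(x, residual)
            nodes.append((a, b, w))
            residual = residual - (w[0] * ramp(x, a, b) + w[1])
        return nodes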


Hybrid NN/HMM-Based Speech Recognition with a Discriminant Neural Feature Extraction

Neural Information Processing Systems

In this paper, we present a novel hybrid architecture for continuous speech recognition systems. It consists of a continuous HMM system extended by an arbitrary neural network that is used as a preprocessor, taking several frames of the feature vector as input and producing feature vectors that are more discriminative with respect to the underlying HMM system. This hybrid system is an extension of a state-of-the-art continuous HMM system, and it is in fact the first hybrid system that is actually capable of outperforming these standard systems in recognition accuracy. Experimental results show a relative error reduction of about 10% over a remarkably good recognition system based on continuous HMMs for the Resource Management 1,000-word continuous speech recognition task.
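
A minimal numpy sketch of the preprocessor's shape; the discriminative training against the HMM system, which is the substance of the method, is omitted, and all dimensions and names are illustrative:

    import numpy as np

    def stack_frames(feats, context=4):
        # concatenate each frame with its +/-context neighbours (edge-padded),
        # so the network sees several frames of the feature vector at once
        T, D = feats.shape
        padded = np.pad(feats, ((context, context), (0, 0)), mode="edge")
        return np.stack([padded[t:t + 2 * context + 1].ravel() for t in range(T)])

    def neural_features(feats, W1, b1, W2, b2):
        # one-hidden-layer preprocessor: a window of frames in, one transformed
        # feature vector out, which is then handed to the continuous HMM system
        h = np.tanh(stack_frames(feats) @ W1 + b1)
        return h @ W2 + b2

    rng = np.random.default_rng(0)
    T, D, H = 100, 13, 64                    # e.g. 13 cepstral coefficients per frame
    W1, b1 = 0.1 * rng.normal(size=(9 * D, H)), np.zeros(H)
    W2, b2 = 0.1 * rng.normal(size=(H, D)), np.zeros(D)
    out = neural_features(rng.normal(size=(T, D)), W1, b1, W2, b2)  # shape (100, 13)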


An Improved Policy Iteration Algorithm for Partially Observable MDPs

Neural Information Processing Systems

A new policy iteration algorithm for partially observable Markov decision processes is presented that is simpler and more efficient than an earlier policy iteration algorithm of Sondik (1971, 1978). The key simplification is representation of a policy as a finite-state controller. This representation makes policy evaluation straightforward. The paper's contribution is to show that the dynamic-programming update used in the policy improvement step can be interpreted as the transformation of a finite-state controller into an improved finite-state controller. The new algorithm consistently outperforms value iteration as an approach to solving infinite-horizon problems.
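
To make the "policy evaluation is straightforward" point concrete: for a finite-state controller, the value function satisfies a linear system over (controller node, world state) pairs. The sketch below is ours and assumes, for brevity, that observation probabilities depend only on the successor state:

    import numpy as np

    def evaluate_fsc(P, R, Z, act, succ, gamma=0.95):
        # Controller node n takes action act[n] and moves to node succ[n][o]
        # on observation o.  V[n, s] solves the linear Bellman equations
        #   V[n,s] = R[s, act[n]]
        #          + gamma * sum_{s'} P[act[n], s, s'] * sum_o Z[s', o] * V[succ[n][o], s']
        N, S, O = len(act), P.shape[1], Z.shape[1]
        A, b = np.eye(N * S), np.zeros(N * S)
        for n in range(N):
            a = act[n]
            for s in range(S):
                i = n * S + s
                b[i] = R[s, a]
                for s2 in range(S):
                    for o in range(O):
                        A[i, succ[n][o] * S + s2] -= gamma * P[a, s, s2] * Z[s2, o]
        return np.linalg.solve(A, b).reshape(N, S)

Given V, the controller is evaluated at a belief state b by starting in the node n that maximizes sum_s b(s) V[n, s]; the policy improvement step then transforms the controller using the dynamic-programming update.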



New Approximations of Differential Entropy for Independent Component Analysis and Projection Pursuit

Neural Information Processing Systems

We derive a first-order approximation of the density of maximum entropy for a continuous 1-D random variable, given a number of simple constraints. This results in a density expansion which is somewhat similar to the classical polynomial density expansions by Gram-Charlier and Edgeworth. Using this approximation of density, an approximation of 1-D differential entropy is derived. The approximation of entropy is both more exact and more robust against outliers than the classical approximation based on the polynomial density expansions, without being computationally more expensive. The approximation has applications, for example, in independent component analysis and projection pursuit.

1 Introduction The basic information-theoretic quantity for continuous one-dimensional random variables is differential entropy. The differential entropy H of a scalar random variable X with density f(x) is defined as H(X) = -\int f(x) \log f(x) \, dx.
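
For concreteness, a sketch of the shape such an approximation takes, assuming the constraint functions G_i have been orthonormalized with respect to the standardized Gaussian density \varphi, and writing \nu for a standardized Gaussian variable:

    f(x) \approx \varphi(x)\Bigl(1 + \sum_i c_i\,G_i(x)\Bigr), \qquad c_i = E\{G_i(X)\},

    H(X) \approx H(\nu) - \frac{1}{2}\sum_i c_i^2, \qquad H(\nu) = \frac{1}{2}\bigl(1 + \log 2\pi\bigr).

Choosing nonpolynomial G_i, rather than the monomials implicit in the Gram-Charlier expansion, is what buys the robustness to outliers claimed above.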


The Error Coding and Substitution PaCTs

Neural Information Processing Systems

A new class of plug in classification techniques has recently been developed in the statistics and machine learning literature. A plug in classification technique (PaCT) is a method that takes a standard classifier (such as LDA or TREES) and plugs it into an algorithm to produce a new classifier. The standard classifier is known as the Plug in Classifier (PiC). These methods often produce large improvements over using a single classifier. In this paper we investigate one of these methods and give some motivation for its success.
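
One concrete PaCT of this flavor is error coding, in which a binary code matrix turns a k-class problem into several two-class problems for the PiC. A minimal numpy sketch, where fit_pic is our assumed interface wrapping any standard classifier and returning a callable that outputs P(label = 1 | x):

    import numpy as np

    def fit_ecoc(X, y, code, fit_pic):
        # error coding: each column of the (k x B) 0/1 code matrix relabels
        # the k classes as a two-class problem; one PiC is trained per column
        return [fit_pic(X, code[y, j]) for j in range(code.shape[1])]

    def predict_ecoc(X, pics, code):
        # collect each PiC's estimate of P(label = 1 | x), then assign every
        # point to the class whose code word is nearest in L1 distance
        P = np.stack([pic(X) for pic in pics], axis=1)        # (n, B)
        d = np.abs(P[:, None, :] - code[None, :, :]).sum(-1)  # (n, k)
        return d.argmin(axis=1)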


Structure Driven Image Database Retrieval

Neural Information Processing Systems

A new algorithm is presented which approximates the perceived visual similarity between images. The images are initially transformed into a feature space which captures visual structure, texture and color using a tree of filters. Similarity is the inverse of the distance in this perceptual feature space. Using this algorithm we have constructed an image database system which can perform example-based retrieval on large image databases. Using carefully constructed target sets, which limit variation to only a single visual characteristic, retrieval rates are quantitatively compared to those of standard methods.

1 Introduction Without supplementary information, there exists no way to directly measure the similarity between the content of images.
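
Once images are embedded, the retrieval step itself is simple; a minimal sketch with our names, Euclidean distance assumed, and the tree-of-filters feature extraction treated as a precomputed black box:

    import numpy as np

    def similarity(query, db):
        # perceived similarity modelled as the inverse of distance in the
        # perceptual feature space
        return 1.0 / (np.linalg.norm(db - query, axis=1) + 1e-9)

    def retrieve(query, db, k=10):
        # example-based retrieval: indices of the k most similar images
        return np.argsort(-similarity(query, db))[:k]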


Characterizing Neurons in the Primary Auditory Cortex of the Awake Primate Using Reverse Correlation

Neural Information Processing Systems

While the functional role of different classes of neurons in the awake primary visual cortex has been extensively studied since the time of Hubel and Wiesel (Hubel and Wiesel, 1962), our understanding of the feature selectivity and functional role of neurons in the primary auditory cortex is much farther from complete. Moving bars have long been recognized as an optimal stimulus for many visual cortical neurons, and this finding has recently been confirmed and extended in detail using reverse correlation methods (Jones and Palmer, 1987; Reid and Alonso, 1995; Reid et al., 1991; Ringach et al., 1997). In this study, we recorded from neurons in the primary auditory cortex of the awake primate, and used a novel reverse correlation technique to compute receptive fields (or preferred stimuli), encompassing both multiple frequency components and ongoing time. These spectrotemporal receptive fields make clear that neurons in the primary auditory cortex, as in the primary visual cortex, typically show considerable structure in their feature processing properties, often including multiple excitatory and inhibitory regions in their receptive fields. These neurons can be sensitive to stimulus edges in frequency composition or in time, and sensitive to stimulus transitions such as changes in frequency. These neurons also show strong responses and selectivity to continuous frequency modulated stimuli analogous to visual drifting gratings.
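
The paper's reverse correlation technique is more involved than this, but the core computation behind a spectrotemporal receptive field is a spike-triggered average of the stimulus spectrogram; a minimal sketch, with our names and time measured in spectrogram frames:

    import numpy as np

    def strf(spectrogram, spike_frames, n_lags=40):
        # spike-triggered average: mean stimulus energy in the n_lags frames
        # preceding each spike, giving a (frequency x time-before-spike) map
        F, T = spectrogram.shape
        sta, count = np.zeros((F, n_lags)), 0
        for t in spike_frames:
            if n_lags <= t < T:
                sta += spectrogram[:, t - n_lags:t]
                count += 1
        return sta / max(count, 1)

Excitatory and inhibitory subregions of the receptive field then appear as positive and negative deviations of this average from the mean stimulus level.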


Recurrent Neural Networks Can Learn to Implement Symbol-Sensitive Counting

Neural Information Processing Systems

Recently researchers have derived formal complexity analyses of analog computation in the setting of discrete-time dynamical systems. As an empirical contrast, training recurrent neural networks (RNNs) produces self-organized systems that are realizations of analog mechanisms. Previous work showed that an RNN can learn to process a simple context-free language (CFL) by counting. Herein, we extend that work to show that an RNN can learn a harder CFL, a simple palindrome, by organizing its resources into a symbol-sensitive counting solution, and we provide a dynamical systems analysis which demonstrates how the network can not only count, but also copy and store counting information.

1 INTRODUCTION Several researchers have recently derived results in analog computation theory in the setting of discrete-time dynamical systems (Siegelmann, 1994; Maass & Orponen, 1997; Moore, 1996; Casey, 1996). For example, a dynamical recognizer (DR) is a discrete-time continuous dynamical system with a given initial starting point and a finite set of Boolean output decision functions (Pollack, 1991).
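
To illustrate the counting mechanism from the earlier CFL work that this paper builds on, here is a minimal hand-coded analog counter of the kind trained RNNs discover; the symbol-sensitive palindrome solution analyzed in the paper uses separate counting dynamics per symbol and is not reproduced here:

    def counting_recognizer(s, step=0.1, tol=1e-9):
        # one-dimensional analog counter: move up the line for each 'a',
        # down for each 'b'; a^n b^n is accepted when the state returns
        # to the origin having never gone below it
        z, seen_b = 0.0, False
        for ch in s:
            if ch == "a":
                if seen_b:            # an 'a' after a 'b' is outside a^n b^n
                    return False
                z += step
            else:
                seen_b = True
                z -= step
                if z < -tol:          # more b's than a's so far
                    return False
        return abs(z) < tol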


Effects of Spike Timing Underlying Binocular Integration and Rivalry in a Neural Model of Early Visual Cortex

Neural Information Processing Systems

In normal vision, the inputs from the two eyes are integrated into a single percept. When dissimilar images are presented to the two eyes, however, perceptual integration gives way to alternation between monocular inputs, a phenomenon called binocular rivalry. Although recent evidence indicates that binocular rivalry involves a modulation of neuronal responses in extrastriate cortex, the basic mechanisms responsible for differential processing of conflicting monocular inputs remain unclear.