Note on Learning Rate Schedules for Stochastic Optimization

Neural Information Processing Systems

We present and compare learning rate schedules for stochastic gradient descent, a general algorithm which includes LMS, online backpropagation and k-means clustering as special cases. We introduce "search-then-converge" type schedules which outperform the classical constant and "running average" (1/t) schedules both in speed of convergence and quality of solution.
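A minimal Python sketch of the three schedule types, assuming the usual search-then-converge form eta(t) = eta0 / (1 + t/tau); the constants and the toy quadratic objective below are illustrative choices, not the paper's experiments.

import numpy as np

def constant(t, eta0=0.1):
    # classical constant schedule
    return eta0

def running_average(t, eta0=0.1):
    # classical "1/t" schedule
    return eta0 / (1.0 + t)

def search_then_converge(t, eta0=0.1, tau=100.0):
    # roughly constant while t << tau (search phase),
    # decays like eta0 * tau / t for t >> tau (converge phase)
    return eta0 / (1.0 + t / tau)

# stochastic gradient descent on f(w) = w**2 / 2 with noisy gradients
rng = np.random.default_rng(0)
w = 5.0
for t in range(1000):
    grad = w + rng.normal(scale=0.5)
    w -= search_then_converge(t) * grad
print(f"final w: {w:.4f}")  # settles near the optimum w = 0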


Learning by Combining Memorization and Gradient Descent

Neural Information Processing Systems

We have created a radial basis function network that allocates a new computational unit whenever an unusual pattern is presented to the network. The network learns by allocating new units and adjusting the parameters of existing units. If the network performs poorly on a presented pattern, then a new unit is allocated which memorizes the response to the presented pattern. If the network performs well on a presented pattern, then the network parameters are updated using standard LMS gradient descent. For predicting the Mackey-Glass chaotic time series, our network learns much faster than do those using back-propagation and uses a comparable number of synapses.
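A rough Python sketch of the allocation rule the abstract describes, assuming two novelty tests (prediction error above eps and distance to the nearest existing unit above delta); the class name, thresholds, and learning rate are illustrative stand-ins, not the paper's values.

import numpy as np

class AllocatingRBF:
    def __init__(self, width=1.0, eps=0.1, delta=1.0, lr=0.05):
        self.centers, self.heights = [], []
        self.width, self.eps, self.delta, self.lr = width, eps, delta, lr

    def _phi(self, x):
        # response of every radial basis unit to input x
        return np.array([np.exp(-np.sum((x - c) ** 2) / self.width ** 2)
                         for c in self.centers])

    def predict(self, x):
        return float(np.dot(self.heights, self._phi(x))) if self.centers else 0.0

    def train_step(self, x, y):
        err = y - self.predict(x)
        dist = (min(np.linalg.norm(x - c) for c in self.centers)
                if self.centers else np.inf)
        if abs(err) > self.eps and dist > self.delta:
            # unusual pattern: allocate a unit that memorizes the response
            self.centers.append(np.asarray(x, dtype=float))
            self.heights.append(err)
        else:
            # familiar pattern: adjust heights by LMS gradient descent
            self.heights = list(np.asarray(self.heights)
                                + self.lr * err * self._phi(x))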


The Tempo 2 Algorithm: Adjusting Time-Delays By Supervised Learning

Neural Information Processing Systems

In this work we describe a new method that automatically adjusts time-delays and the widths of time-windows in artificial neural networks. The inputs to the units are weighted by a Gaussian input window over time, which allows the learning rules for the delays and widths to be derived in the same way as for the weights. Our results on a phoneme classification task compare well with results obtained with the TDNN of Waibel et al., which was manually optimized for the same task.
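As a sketch, assuming a unit whose net input is the signal weighted by a Gaussian window with centre (delay) d and width sigma, the chain rule yields gradients for d and sigma of the same form as the weight gradient; the function names below are illustrative.

import numpy as np

def windowed_input(x, w, d, sigma):
    # net input: signal x[t] weighted by a Gaussian window over time
    t = np.arange(len(x))
    g = np.exp(-((t - d) ** 2) / (2.0 * sigma ** 2))
    return w * np.sum(g * x)

def gradients(x, w, d, sigma):
    # derivatives of the net input w.r.t. weight, delay and width;
    # they fall out of the chain rule just as for ordinary weights
    t = np.arange(len(x))
    g = np.exp(-((t - d) ** 2) / (2.0 * sigma ** 2))
    du_dw = np.sum(g * x)
    du_dd = w * np.sum(g * x * (t - d) / sigma ** 2)
    du_dsigma = w * np.sum(g * x * (t - d) ** 2 / sigma ** 3)
    return du_dw, du_dd, du_dsigma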


Translating Locative Prepositions

Neural Information Processing Systems

The features used in the spatial representations were abstracted from Herskovits (1986). The network was trained using the generalized delta rule (Rumelhart, Hinton, and Williams, 1986) on a set of patterns with four components, three syntactic and one semantic. The syntactic components are a pair of nouns separated by a locative preposition [N1-LP-N2], and the semantic component is a representation of the spatial relationship [SR].
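A hypothetical example of one such four-component training pattern, written as a Python literal; the words and the particular spatial features are illustrative stand-ins, not the paper's actual encodings.

pattern = {
    "N1": "book",    # first noun
    "LP": "on",      # locative preposition
    "N2": "table",   # second noun
    "SR": {          # spatial-relationship features, after Herskovits (1986)
        "contact": 1,
        "support": 1,
        "inclusion": 0,
    },
}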


Dynamics of Learning in Recurrent Feature-Discovery Networks

Neural Information Processing Systems

The self-organization of recurrent feature-discovery networks is studied from the perspective of dynamical systems. Bifurcation theory reveals parameter regimes in which multiple equilibria or limit cycles coexist with the equilibrium at which the networks perform principal component analysis.


Phonetic Classification and Recognition Using the Multi-Layer Perceptron

Neural Information Processing Systems

In this paper, we will describe several extensions to our earlier work, utilizing a segment-based approach. We will formulate our segmental framework and report our study on the use of multi-layer perceptrons for detection and classification of phonemes. We will also examine the outputs of the network, and compare the network performance with other classifiers. Our investigation is performed within a set of experiments that attempts to recognize 38 vowels and consonants in American English independent of speaker.


Evaluation of Adaptive Mixtures of Competing Experts

Neural Information Processing Systems

We compare the performance of the modular architecture, composed of competing expert networks, suggested by Jacobs, Jordan, Nowlan and Hinton (1991) to the performance of a single back-propagation network on a complex, but low-dimensional, vowel recognition task. Simulations reveal that this system is capable of uncovering interesting decompositions in a complex task. The type of decomposition is strongly influenced by the nature of the input to the gating network that decides which expert to use for each case. The modular architecture also exhibits consistently better generalization on many variations of the task.
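A minimal sketch of the blending step in such a modular architecture, assuming softmax gating over expert outputs in the style of Jacobs et al.; the gating parameterization here is an illustrative stand-in for the trained networks.

import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def mixture_output(x, experts, gating_W):
    # experts: list of callables, one output each;
    # gating_W: maps the gating input to one score per expert
    g = softmax(gating_W @ x)                    # mixing proportions
    outputs = np.array([e(x) for e in experts])  # one output per expert
    return g @ outputs, g

The decomposition the system discovers depends on what the gating network sees: feeding it different subsets of the input steers which expert specializes on which cases.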


The Devil and the Network: What Sparsity Implies to Robustness and Memory

Neural Information Processing Systems

Robustness is a commonly bruited property of neural networks; in particular, a folk theorem in neural computation asserts that neural networks, in contexts with large interconnectivity, continue to function efficiently, albeit with some degradation, in the presence of component damage or loss. A second folk theorem in such contexts asserts that dense interconnectivity between neural elements is a sine qua non for the efficient usage of resources. These premises are formally examined in this communication in a setting that invokes the notion of the "devil".


A B-P ANN Commodity Trader

Neural Information Processing Systems

An Artificial Neural Network (ANN) is trained to recognize a buy/sell (long/short) pattern for a particular commodity futures contract. The back-propagation of errors algorithm was used to encode the relationship between the desired long/short output and 18 fundamental variables plus 6 (or 18) technical variables into the ANN. Trained on one year of past data, the ANN is able to predict long/short market positions for 9 months in the future that would have made a $10,301 profit on an investment of less than $1,000. The networks used were simple, feed-forward, single-hidden-layer networks with an input unit for each of the 24 (or 36) variables, N hidden units, and one output unit; N varied from six through sixteen.
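A toy Python sketch of a forward pass through a network of the shape described (here 24 inputs for the 18 fundamental plus 6 technical variables, N hidden units, one long/short output); the weights, sizes, and tanh output coding are illustrative assumptions, not the paper's trained model.

import numpy as np

def forward(x, W1, b1, W2, b2):
    h = np.tanh(W1 @ x + b1)    # single hidden layer
    y = np.tanh(W2 @ h + b2)    # output in (-1, 1): short vs. long
    return float(y[0])

rng = np.random.default_rng(0)
n_in, n_hidden = 24, 10         # N varied from 6 to 16 in the paper
W1 = 0.1 * rng.normal(size=(n_hidden, n_in))
b1 = np.zeros(n_hidden)
W2 = 0.1 * rng.normal(size=(1, n_hidden))
b2 = np.zeros(1)
signal = forward(rng.normal(size=n_in), W1, b1, W2, b2)
print("long" if signal > 0 else "short")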


Bumptrees for Efficient Function, Constraint and Classification Learning

Neural Information Processing Systems

A new class of data structures called "bumptrees" is described. These structures are useful for efficiently implementing a number of neural network related operations. An empirical comparison with radial basis functions is presented on a robot arm mapping learning task.
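A minimal sketch of the branch-and-bound querying that such trees support, assuming each node stores a "bump" that majorizes every leaf function beneath it; the Gaussian bound and the interface are illustrative assumptions, not the paper's construction.

import numpy as np

class BumpNode:
    def __init__(self, center, radius, height, children=()):
        # bound(x) is >= every descendant leaf's function value at x
        self.center = np.asarray(center, dtype=float)
        self.radius, self.height = radius, height
        self.children = list(children)

    def bound(self, x):
        d = max(0.0, np.linalg.norm(x - self.center) - self.radius)
        return self.height * np.exp(-d * d)

def query_max(node, x, best=-np.inf):
    # prune any subtree whose bound cannot beat the best leaf found so far
    if node.bound(x) <= best:
        return best
    if not node.children:
        return max(best, node.bound(x))  # a leaf (radius 0) is exact
    for child in sorted(node.children, key=lambda c: -c.bound(x)):
        best = query_max(child, x, best)
    return best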