Information Technology
Representation and Induction of Finite State Machines using Time-Delay Neural Networks
Clouse, Daniel S., Giles, C. Lee, Horne, Bill G., Cottrell, Garrison W.
This work investigates the representational and inductive capabilities of time-delay neural networks (TDNNs) in general, and of two subclasses of TDNN, those with delays only on the inputs (IDNN), and those which include delays on hidden units (HDNN). Both architectures are capable of representing the same class of languages, the definite memory machine (DMM) languages, but the delays on the hidden units in the HDNN helps it outperform the IDNN on problems composed of repeated features over short time windows.
LSTM can Solve Hard Long Time Lag Problems
Hochreiter, Sepp, Schmidhuber, Jรผrgen
Standard recurrent nets cannot deal with long minimal time lags between relevant signals. Several recent NIPS papers propose alternative methods. We first show: problems used to promote various previous algorithms can be solved more quickly by random weight guessing than by the proposed algorithms. We then use LSTM, our own recent algorithm, to solve a hard problem that can neither be quickly solved by random search nor by any other recurrent net algorithm we are aware of.
VLSI Implementation of Cortical Visual Motion Detection Using an Analog Neural Computer
Etienne-Cummings, Ralph, Spiegel, Jan Van der, Takahashi, Naomi, Apsel, Alyssa, Mueller, Paul
Two dimensional image motion detection neural networks have been implemented using a general purpose analog neural computer. The neural circuits perform spatiotemporal feature extraction based on the cortical motion detection model of Adelson and Bergen. The neural computer provides the neurons, synapses and synaptic time-constants required to realize the model in VLSI hardware. Results show that visual motion estimation can be implemented with simple sum-andthreshold neural hardware with temporal computational capabilities. The neural circuits compute general 20 visual motion in real-time.
488 Solutions to the XOR Problem
Coetzee, Frans, Stonick, Virginia L.
A globally convergent homotopy method is defined that is capable of sequentially producing large numbers of stationary points of the multi-layer perceptron mean-squared error surface. Using this algorithm large subsets of the stationary points of two test problems are found. It is shown empirically that the MLP neural network appears to have an extreme ratio of saddle points compared to local minima, and that even small neural network problems have extremely large numbers of solutions.
Continuous Sigmoidal Belief Networks Trained using Slice Sampling
These include Boltzmann machines (Hinton and Sejnowski 1986), binary sigmoidal belief networks (Neal 1992) and Helmholtz machines (Hinton et al. 1995; Dayan et al. 1995). However, some hidden variables, such as translation or scaling in images of shapes, are best represented using continuous values. Continuous-valued Boltzmann machines have been developed (Movellan and McClelland 1993), but these suffer from long simulation settling times and the requirement of a "negative phase" during learning. Tibshirani (1992) and Bishop et al. (1996) consider learning mappings from a continuous latent variable space to a higher-dimensional input space. MacKay (1995) has developed "density networks" that can model both continuous and categorical latent spaces using stochasticity at the topmost network layer. In this paper I consider a new hierarchical top-down connectionist model that has stochastic hidden variables at all layers; moreover, these variables can adapt to be continuous or categorical. The proposed top-down model can be viewed as a continuous-valued belief network, which can be simulated by performing a quick top-down pass (Pearl 1988).
A Constructive Learning Algorithm for Discriminant Tangent Models
Sona, Diego, Sperduti, Alessandro, Starita, Antonina
To reduce the computational complexity of classification systems using tangent distance, Hastie et al. (HSS) developed an algorithm to devise rich models for representing large subsets of the data which computes automatically the "best" associated tangent subspace. Schwenk & Milgram proposed a discriminant modular classification system (Diabolo) based on several autoassociative multilayer perceptrons which use tangent distance as error reconstruction measure. We propose a gradient based constructive learning algorithm for building a tangent subspace model with discriminant capabilities which combines several of the the advantages of both HSS and Diabolo: devised tangent models hold discriminant capabilities, space requirements are improved with respect to HSS since our algorithm is discriminant and thus it needs fewer prototype models, dimension of the tangent subspace is determined automatically by the constructive algorithm, and our algorithm is able to learn new transformations.
Rapid Visual Processing using Spike Asynchrony
Thorpe, Simon J., Gautrais, Jacques
We have investigated the possibility that rapid processing in the visual system could be achieved by using the order of firing in different neurones as a code, rather than more conventional firing rate schemes. Using SPIKENET, a neural net simulator based on integrate-and-fire neurones and in which neurones in the input layer function as analogto-delay converters, we have modeled the initial stages of visual processing. Initial results are extremely promising. Even with activity in retinal output cells limited to one spike per neuron per image (effectively ruling out any form of rate coding), sophisticated processing based on asynchronous activation was nonetheless possible.
The Neurothermostat: Predictive Optimal Control of Residential Heating Systems
Mozer, Michael C., Vidmar, Lucky, Dodier, Robert H.
The Neurothermostat is an adaptive controller that regulates indoor air temperature in a residence by switching a furnace on or off. The task is framed as an optimal control problem in which both comfort and energy costs are considered as part of the control objective. Because the consequences of control decisions are delayed in time, the N eurothermostat must anticipate heating demands with predictive models of occupancy patterns and the thermal response of the house and furnace. Occupancy pattern prediction is achieved by a hybrid neural net / lookup table. The Neurothermostat searches, at each discrete time step, for a decision sequence that minimizes the expected cost over a fixed planning horizon.
A New Approach to Hybrid HMM/ANN Speech Recognition using Mutual Information Neural Networks
Rigoll, Gerhard, Neukirchen, Christoph
This paper presents a new approach to speech recognition with hybrid HMM/ANN technology. While the standard approach to hybrid HMMI ANN systems is based on the use of neural networks as posterior probability estimators, the new approach is based on the use of mutual information neural networks trained with a special learning algorithm in order to maximize the mutual information between the input classes of the network and its resulting sequence of firing output neurons during training. It is shown in this paper that such a neural network is an optimal neural vector quantizer for a discrete hidden Markov model system trained on Maximum Likelihood principles. One of the main advantages of this approach is the fact, that such neural networks can be easily combined with HMM's of any complexity with context-dependent capabilities. It is shown that the resulting hybrid system achieves very high recognition rates, which are now already on the same level as the best conventional HMM systems with continuous parameters, and the capabilities of the mutual information neural networks are not yet entirely exploited.