AITopics

Country:

Oceania > Australia (0.28)
North America > United States > California > San Diego County (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.61)

Hochreiter, Sepp, Schmidhuber, Jürgen

LSTM can Solve Hard Long Time Lag Problems

Neural Information Processing Systems, Dec-31-1997

Standard recurrent nets cannot deal with long minimal time lags between relevant signals. Several recent NIPS papers propose alternative methods. We first show: problems used to promote various previous algorithms can be solved more quickly by random weight guessing than by the proposed algorithms. We then use LSTM, our own recent algorithm, to solve a hard problem that can neither be quickly solved by random search nor by any other recurrent net algorithm we are aware of.

algorithm, memory cell, sequence, (16 more...)

Country:

North America > United States > California > San Mateo County > San Mateo (0.05)
Europe > Germany > North Rhine-Westphalia > Upper Bavaria > Munich (0.05)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
(4 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Hochreiter, Sepp, Schmidhuber, Jürgen

LSTM can Solve Hard Long Time Lag Problems

Neural Information Processing Systems, Dec-31-1997

Standard recurrent nets cannot deal with long minimal time lags between relevant signals. Several recent NIPS papers propose alternative methods. We first show: problems used to promote various previous algorithms can be solved more quickly by random weight guessing than by the proposed algorithms. We then use LSTM, our own recent algorithm, to solve a hard problem that can neither be quickly solved by random search nor by any other recurrent net algorithm we are aware of. 1 TRIVIAL PREVIOUS LONG TIME LAG PROBLEMS Traditional recurrent nets fail in case of long minimal time lags between input signals and corresponding error signals [7, 3]. Many recent papers propose alternative methods, e.g., [16, 12, 1, 5, 9]. For instance, Bengio et al. investigate methods such as simulated annealing, multi-grid random search, time-weighted pseudo-Newton optimization, and discrete error propagation [3].
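The idea of random weight guessing mentioned in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy task (one relevant signal followed by `lag` zero inputs), the single-unit tanh network, the sampling interval `scale`, and all function names. The search simply redraws all weights until the net classifies every training sequence correctly.

```python
import math
import random


def run_net(w_in, w_rec, sequence):
    """Run a one-unit tanh recurrent net and return its final activation."""
    h = 0.0
    for x in sequence:
        h = math.tanh(w_in * x + w_rec * h)
    return h


def make_task(lag):
    """Hypothetical toy task: the first input (+1 or -1) decides the class,
    followed by `lag` uninformative zero inputs."""
    return [([s] + [0.0] * lag, s > 0) for s in (1.0, -1.0)]


def solves(w_in, w_rec, task):
    """True if the net's final activation has the right sign on every sequence."""
    return all((run_net(w_in, w_rec, seq) > 0) == label for seq, label in task)


def random_weight_guessing(task, max_trials=1000, scale=10.0, seed=0):
    """Draw all weights uniformly from [-scale, scale] until the task is solved.

    Returns (w_in, w_rec, trial_number), or None if no trial succeeded.
    """
    rng = random.Random(seed)
    for trial in range(1, max_trials + 1):
        w_in = rng.uniform(-scale, scale)
        w_rec = rng.uniform(-scale, scale)
        if solves(w_in, w_rec, task):
            return w_in, w_rec, trial
    return None
```

Calling `random_weight_guessing(make_task(50))` searches for weights that carry the first input across a lag of 50 steps. On a task this simple, a large fraction of weight settings already works, which is the kind of triviality the abstract argues against using as a benchmark.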

artificial intelligence, machine learning, memory cell, (19 more...)

Country:

North America > United States (0.47)
Europe (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Senior, Andrew W., Robinson, Anthony J.

Forward-backward retraining of recurrent neural networks

This paper describes the training of a recurrent neural network as the letter posterior probability estimator for a hidden Markov model, off-line handwriting recognition system.

neural network, probability, segmentation, (14 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Africa > Central African Republic > Ombella-M'Poko > Bimbo (0.05)
North America > United States (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.63)

Frey, Brendan J., Hinton, Geoffrey E., Dayan, Peter

Does the Wake-sleep Algorithm Produce Good Density Estimators?

The wake-sleep algorithm (Hinton, Dayan, Frey and Neal 1995) is a relatively efficient method of fitting a multilayer stochastic generative model to high-dimensional data. In addition to the top-down connections in the generative model, it makes use of bottom-up connections for approximating the probability distribution over the hidden units given the data, and it trains these bottom-up connections using a simple delta rule. We use a variety of synthetic and real data sets to compare the performance of the wake-sleep algorithm with Monte Carlo and mean field methods for fitting the same generative model and also compare it with other models that are less powerful but easier to fit. 1 INTRODUCTION Neural networks are often used as bottom-up recognition devices that transform input vectors into representations of those vectors in one or more hidden layers. But multilayer networks of stochastic neurons can also be used as top-down generative models that produce patterns with complicated correlational structure in the bottom visible layer. In this paper we consider generative models composed of layers of stochastic binary logistic units. Given a generative model parameterized by top-down weights, there is an obvious way to perform unsupervised learning. The generative weights are adjusted to maximize the probability that the visible vectors generated by the model would match the observed data.
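As a rough illustration of the top-down generative models described here, the following sketch samples one visible vector from a two-layer network of stochastic binary logistic units. It covers only the generative pass, not the wake-sleep learning rule, and the layer sizes, parameter layout (`weights[i][j]` connects hidden unit `j` to visible unit `i`), and function names are assumptions.

```python
import math
import random


def sigmoid(a):
    """Logistic function; inputs are assumed to be of moderate magnitude."""
    return 1.0 / (1.0 + math.exp(-a))


def sample_units(total_inputs, rng):
    """Turn each unit on (1) with probability sigmoid(total input), else off (0)."""
    return [1 if rng.random() < sigmoid(a) else 0 for a in total_inputs]


def generate(hidden_bias, weights, visible_bias, rng):
    """Top-down pass: sample the hidden layer from its biases, then sample
    the visible layer given the hidden states."""
    hidden = sample_units(hidden_bias, rng)
    visible_input = [
        b + sum(w * h for w, h in zip(row, hidden))
        for row, b in zip(weights, visible_bias)
    ]
    return hidden, sample_units(visible_input, rng)
```

For example, `generate([0.0, 0.0], [[2.0, -2.0], [-2.0, 2.0], [0.0, 0.0]], [0.0, 0.0, 0.0], random.Random(0))` returns a pair of binary lists: two hidden states and three visible states.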

algorithm produce good density estimator, helmholtz machine, probability, (10 more...)

Country:

North America > Canada > Ontario > Toronto (0.15)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.82)

Hihi, Salah El, Bengio, Yoshua

Hierarchical Recurrent Neural Networks for Long-Term Dependencies

Learning long-term dependencies is not as difficult with NARX recurrent neural networks.

dependency, long-term dependency, time scale, (15 more...)

Country:

North America > Canada > Quebec > Montreal (0.05)
Asia > Middle East > Jordan (0.04)
North America > United States > New York > New York County > New York City (0.04)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)

Wu, Lizhong, Moody, John E.

A Smoothing Regularizer for Recurrent Neural Networks

We derive a smoothing regularizer for recurrent network models by requiring robustness in prediction performance to perturbations of the training data. The regularizer can be viewed as a generalization of the first order Tikhonov stabilizer to dynamic models. The closed-form expression of the regularizer covers both time-lagged and simultaneous recurrent nets, with feedforward nets and one-layer linear nets as special cases. We have successfully tested this regularizer in a number of case studies and found that it performs better than standard quadratic weight decay. 1 Introduction One technique for preventing a neural network from overfitting noisy data is to add a regularizer to the error function being minimized. Regularizers typically smooth the fit to noisy data. Well-established techniques include ridge regression, see (Hoerl & Kennard 1970), and more generally spline smoothing functions or Tikhonov stabilizers that penalize the mth-order squared derivatives of the function being fit, as in (Tikhonov & Arsenin 1977), (Eubank 1988), (Hastie & Tibshirani 1990) and (Wahba 1990). These methods have recently been extended to networks of radial basis functions (Girosi, Jones & Poggio 1995), and several heuristic approaches have been developed for sigmoidal neural networks, for example, quadratic weight decay (Plaut, Nowlan & Hinton 1986), weight elimination (Scalettar & Zee 1988), (Chauvin 1990), (Weigend, Rumelhart & Huberman 1990) and soft weight sharing (Nowlan & Hinton 1992).
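To make "adding a regularizer to the error function" concrete, here is a minimal sketch of the baseline the abstract compares against, standard quadratic weight decay, applied to a plain linear model. This is not the paper's smoothing regularizer; the linear model, the function names, and the penalty strength `lam` are assumptions chosen for illustration.

```python
def predict(weights, x):
    """Linear model output: dot product of weights and input."""
    return sum(w * xi for w, xi in zip(weights, x))


def mean_squared_error(weights, inputs, targets):
    """Average squared prediction error over the data set."""
    errors = [(predict(weights, x) - t) ** 2 for x, t in zip(inputs, targets)]
    return sum(errors) / len(errors)


def weight_decay_penalty(weights, lam):
    """Quadratic weight decay: lam times the sum of squared weights."""
    return lam * sum(w * w for w in weights)


def regularized_error(weights, inputs, targets, lam):
    """Objective to minimize: data error plus the weight decay penalty."""
    return mean_squared_error(weights, inputs, targets) + weight_decay_penalty(weights, lam)
```

With `lam = 0` the objective reduces to the plain data error. Increasing `lam` trades fit to the training data for smaller weights, which is the general mechanism the abstract refers to.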

neural network, regularizer, weight decay, (13 more...)

Country:

North America > United States > California > San Francisco County > San Francisco (0.15)
North America > United States > New York (0.05)
North America > United States > Oregon > Multnomah County > Portland (0.04)
(3 more...)

Industry: Banking & Finance (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.42)

Bengio, Yoshua, Gingras, Francois

Recurrent Neural Networks for Missing or Asynchronous Data

In this paper we propose recurrent neural networks with feedback into the input units for handling two types of data analysis problems. On the one hand, this scheme can be used for static data when some of the input variables are missing. On the other hand, it can also be used for sequential data, when some of the input variables are missing or are available at different frequencies.

experiment, input variable, recurrent network, (13 more...)

Country:

Asia > Middle East > Jordan (0.05)
North America > Canada > Quebec > Montreal (0.05)
North America > United States > California > San Mateo County > San Mateo (0.05)
(4 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.61)

Coolen, A.C.C., Laughton, S. N., Sherrington, D.

Modern Analytic Techniques to Solve the Dynamics of Recurrent Neural Networks

We describe the use of modern analytical techniques in solving the dynamics of symmetric and nonsymmetric recurrent neural networks near saturation. These explicitly take into account the correlations between the post-synaptic potentials, and thereby allow for a reliable prediction of transients. 1 INTRODUCTION Recurrent neural networks have been rather popular in the physics community, because they lend themselves so naturally to analysis with tools from equilibrium statistical mechanics. This was the main theme of physicists between, say, 1985 and 1990. Less familiar to the neural network community is a subsequent wave of theoretical physical studies, dealing with the dynamics of symmetric and nonsymmetric recurrent networks. The strategy here is to try to describe the processes at a reduced level of an appropriate small set of dynamic macroscopic observables.

analytic technique, sherrington, simulation, (14 more...)